PREFACE


The 1992 Annual Review of Public Health continues to meet the important 
need, increasingly recognized, to bridge academic public health and pre- 
ventive medicine with public health practice and clinical preventive services. 

We see growing attention to prevention and to a population-based orienta- 
tion to health, health care needs, and costs among health care professionals 
and among health policy analysts and policymakers. We believe that the 
Annual Review of Public Health fuels this movement with essential sub- 
stantive material. 

We are working with a committee and the staff of the American College of
Preventive Medicine to help them utilize the articles in the Annual Review of 
Public Health in their Continuing Education and Self-Assessment program. 
We believe that these volumes can be similarly used by public health schools 
and their constituent departments and by public health practitioners and their 
organizations. 

In organizing the text, we have experimented this year with grouping the 
articles by heading. We have long organized the cumulative table of contents 
under Age and Disease Specific, Behavioral Aspects of Health, Environmen- 
tal Health, Epidemiology/Biostatistics, and Health Services. The task is 
complicated by the fact that certain topics and certain articles deliberately 
combine public health areas, such as epidemiology and health services, or 
epidemiology and environmental health. This year, we have added the head- 
ing of Public Health Practice to give greater visibility to our commitment to 
bridging academia and practice, and to stimulate suggestions of topics and 
authors from a broadened array of sources, especially our readers. We 
welcome your comments. 

Despite increased attention to developing country topics and more in- 
vitations to authors from other countries, we have not used the heading 
International Health. These articles cut across the headings, and a separate 
heading would be confusing. Also, we believe that many, if not most, of our 
articles have international relevance. Do look through the cumulative table of 
contents at the end of this text and “browse” beyond your own specialty field. 
INTRODUCTION


As of January 1990, the World Health Organization (WHO) estimates that 
8-10 million persons are infected with human immunodeficiency virus (HIV) 
worldwide. Approximately 3 million women, mostly of reproductive age, and 
more than 500,000 infants and children are infected (35). Eighty percent of 
these infected women and children reside in sub-Saharan Africa, where the 
estimated prevalence of HIV infection is 2500/100,000 women aged 15-49 
(34). In some African cities, HIV prevalence rates of up to 30% have been 
documented (114, 142). As heterosexual transmission of HIV increases in 
other areas of the world, the numbers of infected women and, consequently, 
their children also increase (124). In Latin America, an estimated 200,000 
women are infected, with a prevalence of 200/100,000 women aged 15-49
years (34). There is a rapid increase in HIV infection among drug users and 
prostitutes in some Asian countries (36). In the United States, women com- 
prise 10% of the 171,865 adult cases of AIDS reported to the Centers for 
Disease Control as of May 1, 1991 (23). In 1991, AIDS was the fifth leading 
cause of premature death in women aged 15-49; in New York City, AIDS 
was the leading cause of death for women aged 20-40 (31, 38).


¹The US Government has the right to retain a nonexclusive royalty-free license in and to any
copyright covering this paper. 
The World Health Organization estimates that the HIV pandemic will kill 3
million or more women and 2.7 million children worldwide during the 1990s 
(34, 37). AIDS will become the leading cause of death for women aged 15-49
in major cities throughout the Americas, western Europe, and sub-Saharan 
Africa, with infant and child mortality rates as much as 30% greater than 
previously projected. In addition, it is estimated that up to 5.5 million 
children under 15 will be orphaned because of the premature death of their 
HIV-infected mothers and fathers from AIDS (34, 130). 


EPIDEMIOLOGY OF HIV IN WOMEN AND INFANTS 


Because more than 90% of HIV-infected children acquired their infection 
perinatally, the incidence of HIV infection in infants and children is de- 
pendent upon the prevalence of HIV infection in women of reproductive age, 
the fertility rate of these women, and the risk of perinatal transmission. 
Because the latter two factors appear to be highly variable among women in 
different populations, the overall rate of perinatal infection is difficult to 
predict. The following section discusses the prevalence of HIV infection in 
women, the associated risk factors for HIV acquisition in women, and how 
these variables influence perinatal transmission of HIV. Acquisition of HIV 
through contaminated blood or blood products also remains a risk in many 
parts of the world; therefore, we also present data regarding this additional 
mode of transmission. 
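
As a rough formalization of this dependence, and only under simplifying assumptions, the expected annual number of perinatally infected infants can be written as a product. The symbols are introduced here for illustration and do not appear in the source: N is the number of women of reproductive age, p the prevalence of HIV infection among them, f their annual fertility rate, and t the risk of perinatal transmission.

```latex
\[
  I_{\text{perinatal}} \;\approx\; N \times p \times f \times t
\]
```

Because f and t vary among populations, any estimate built from this product inherits that uncertainty.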


AIDS Surveillance 


Globally, there has been a marked increase in the number of female AIDS 
cases. In sub-Saharan Africa and some parts of the Caribbean, the male-to- 
female ratio for AIDS cases is 1:1, primarily as a result of heterosexual 
transmission (69, 126, 135, 136). In developed countries, such as those in 
North America and Europe, the number of AIDS cases diagnosed in women is 
still fewer than male cases; however, the number is increasing at a faster rate 
each year because of intravenous (IV) drug use and heterosexual transmis- 
sion. For example, in the US, the number of AIDS cases diagnosed in women 
aged 18-44 increased 29% from 1988 to 1989, as compared with an increase 
of 18% in men in the same age group (21). In 1991, 48% of female AIDS 
cases acknowledged IV drug use; 35% acknowledged heterosexual contact 
with an individual at risk for HIV; 7% had a history of receipt of blood 
transfusion; and 11% were listed as other or undetermined (23), including 
individuals who may have acquired HIV infection within health care settings, 
whose mode of exposure is unknown, and who may still be under investiga- 
tion, have died, were lost to follow-up, or refused interview. 

In the US, more than 3000 children have been reported to the Centers for 
Disease Control (CDC) with AIDS: 88% acquired it from birth to a mother
known to be at risk for HIV infection, 5% from a blood transfusion con- 
taminated with HIV infection, 4% from factor 9 concentrates, and 4% from an 
undetermined source. In most other countries, more than 90% of children 
with AIDS acquired their infection from birth to an infected mother. In 
countries with evidence of heterosexual transmission and a male-to-female 
ratio approximating one, infants and children with AIDS may comprise as 
much as 20% of the total number of AIDS cases reported in national sur- 
veillance (136, 138). The demographic characteristics of AIDS cases in 
women and in children with perinatally acquired infection primarily reflect 
the characteristics of groups at risk for infection, especially IV drug users. In 
the US, 59% of perinatally acquired AIDS cases are among black children and 
26% are in Hispanic children; their cumulative AIDS incidence rates are 21 
and 13 times, respectively, the incidence rates in white children (25, 29). In 
parts of New York and New Jersey, most IV drug users in treatment are black 
or Hispanic and live in poor inner city communities, where the prevalence of 
HIV infection among these drug users is nearly 50% (28, 44, 45). Fifteen 
metropolitan areas, mostly along the East Coast, which include only 18% of 
the US pediatric population, account for 70% of the perinatal cases (25). 

Although the greatest number of pediatric AIDS cases occurs in the first 
year of life, the relative impact of AIDS as the cause of death has been most 
striking in the 1-4 year age group: By 1990, AIDS was the leading cause of 
death among Hispanic children, and the second leading cause among black 
children in the US. 

Increasing AIDS-related adult mortality in Africa, as recently documented 
(43), is creating a large and growing number of children under age 15 whose 
mothers have died of AIDS. During the 1990s, AIDS will kill 1.5-2.9 million
women of reproductive age in Central Africa, thus producing 3.1-5.5 million
AIDS orphans (130), 6-11% of the population under 15. In these countries, 
where 20% of mothers are HIV-infected, childhood mortality under five years 
of age will rise from 100/1000 live births to 136/1000, thereby negating or 
reversing the gains of childhood survival achieved in the past few decades. 
Similar large numbers of orphans are predicted in the Caribbean and several 
urban centers of the US. Many of these children will be driven to prostitution 
for survival, thus enhancing further transmission of HIV in adolescents. They 
are joining those now referred to by the United Nations Children’s Fund as 
“children in extremely difficult circumstances,” which includes children en- 
dangered by armed conflict and other disasters, those exploited by child labor, 
street children, and children who are victims of abuse and neglect (71, 130). 
Although the phenomenon of AIDS orphans is also affecting Western cities 
like New York, the predominance of heterosexual transmission and absolute 
number of parents infected with HIV make this problem considerably greater 
in Africa (34). As a result, national and international government and
nongovernment service providers in Africa need to recognize this potential 
impact of HIV infection on children, expand AIDS prevention efforts, and 
develop policies and programs to address children’s AIDS-related needs. 


HIV Prevalence in Women 


Because AIDS case reporting is variable and subject to a variety of problems, 
such as underreporting and difficulties with case definition, seroprevalence 
studies better reflect the real magnitude of HIV infection. Knowledge of the 
general prevalence and possible incidence of HIV infections is essential to 
monitor the epidemiologic patterns and scope of the HIV pandemic (35). 
Estimates of the number of future cases of HIV-related disease, including 
AIDS, will be dependent upon the number of persons currently infected with 
HIV. However, seroprevalence data must also be interpreted with caution, 
because of the differences in methods in the populations surveyed. Local or 
regional findings regarding HIV seroprevalence cannot be generalized to the 
national level, and the extraordinary cultural diversity of many countries 
should limit any unwarranted extrapolations from small, more intensely 
studied groups to large populations. 

A variety of seroprevalence studies have attempted to estimate the frequen- 
cy of HIV infection in women of reproductive age (Table 1). Female appli- 
cants for US military service are routinely tested and have shown a fairly 
stable seroprevalence rate nationally of 0.06%, although rates are much 
higher in certain inner cities of the Northeast: approximately 0.5% in northern 
New Jersey, New York City, and San Juan, Puerto Rico (27, 32, 42, 114). 
Seroprevalence rates in black and Hispanic female military applicants are 
eight and four times higher, respectively, than those among white applicants. 
Seroprevalence among first-time female blood donors is approximately 
0.01%. Blinded antenatal screening and surveys of women delivering babies 
have also documented variable rates in different cities (114). However, 
several of these studies have shown that many seropositive women do not 
acknowledge or know they have risk factors for infection. For instance, 
among women delivering babies at a New York City hospital (92) and at the 
Johns Hopkins Hospital in Baltimore (4, 5), between one third and one half of 
the seropositive women had no reported risk for HIV infection. In other 
words, they were likely infected through heterosexual contact with a partner 
they did not recognize to be infected or at increased risk. Similar studies in 
sexually transmitted disease (STD) clinics have documented increasing rates 
among women who may have been infected through heterosexual contact with 
a partner of unknown risk. In the CDC blinded HIV surveys, seroprevalence 
of HIV in more than 100,000 women attending STD clinics was 2.2% (21). 
Median seroprevalence rates by clinic type for women attending prenatal, 
family planning, and drug-treatment clinics were 0.9%, 0.5%, and 3.7%,
respectively (21). In 1988, a national survey of 2 million childbearing women 
per year was initiated in 44 states, the District of Columbia, and Puerto Rico 
to measure the prevalence of HIV infection among women delivering infants 
over time. These data will be useful in developing, targeting, and evaluating 
appropriate education and prevention programs. Thus far, the highest sero- 
prevalence rates have been in New York (0.58%, with 1.25% in New York 
City and 0.16% upstate), the District of Columbia (0.55%), New Jersey 
(0.49%), and Florida (0.49%); most states have overall rates under 0.1%. The
estimated national rate was 0.15%, which corresponds to 5500-6000 HIV- 
infected women delivering liveborn infants in 1989. If 30% is the rate of 
perinatal transmission, 1600-1800 of these children were infected as a result
of maternal infection in 1989. This number is three times the number of 
children reported with perinatally acquired AIDS in 1989, which suggests that 
the future number of pediatric cases will be even higher. Rates of 1-4% have
been documented in black and Hispanic childbearing women in these surveys,
which clearly reflect the impact of HIV infection in minority populations.
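
The arithmetic behind this estimate can be reproduced in a few lines. The sketch below takes the range of 5500-6000 infected women delivering liveborn infants and the assumed 30% perinatal transmission rate from the text. The function name and the one-infant-per-delivery simplification are illustrative assumptions, not part of the survey's method.

```python
def expected_infected_infants(infected_mothers, transmission_rate):
    """Expected number of perinatally infected infants.

    Assumes one liveborn infant per delivery and a single transmission
    rate applied to every birth (both are simplifications).
    """
    return infected_mothers * transmission_rate


# Figures quoted in the text for 1989.
low_mothers, high_mothers = 5500, 6000
assumed_rate = 0.30  # the text's working assumption, not a measured value

low = expected_infected_infants(low_mothers, assumed_rate)
high = expected_infected_infants(high_mothers, assumed_rate)
print(f"Expected infected infants: {low:.0f} to {high:.0f}")
```

The result, roughly 1650 to 1800 infants, is consistent with the range of 1600-1800 quoted in the text.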

In developing countries, antenatal surveys for HIV among apparently 
healthy women of childbearing age reveal that a surprisingly large proportion
of those women living in urban areas in some countries have high rates of HIV 
infection. For example, in Port-au-Prince, Haiti, the rate of HIV infection in 
pregnant women rose from 8% to more than 10% between 1982 and 1988 
(14). In African cities, seroprevalence rates of 5-30% have been documented
among women who attend antenatal clinics (17, 59, 67, 90, 95, 105, 126, 
137, 142, 143). Rates of HIV infection have risen from 0% in 1980 to 3% in
1988 in Nairobi (126, 127), and from 0.2% in 1970 to 8% in 1986 in
Kinshasa, Zaire (142).


Table 1 Seroprevalence of HIV-1 infection in antenatal women

Location         Number tested    Rate
Rwanda           900              30.3%
Uganda           497              24.3%
Rwanda           3891             23.1%
Burundi          1255             17.5%
Zambia           1954             11.6%
Kenya            2400             7.1%
Zaire            1491             6.0%
New York         276,609          0.66%
Massachusetts    30,708           0.26%
United States    >2 million       0.15%
London           114,515          0.15%
Italy            23,491           0.024%
Sweden           130,508          0.013%


6 QUINN, RUFF & MODLIN


Heterosexual Transmission 


Sexual behavior, exposure to an HIV-infected individual, and a history of 
STDs appear to be the major risk factors for HIV infection in both men and 
women. In some developing countries, urban prostitutes, who have a high 
infection rate (18-86%), played a prominent role in the initial dissemination 
of HIV in many parts of the world (39, 88, 126-128, 135, 148, 156). 
However, even among African prostitutes, the presence of STDs appears to 
be strongly associated with HIV transmission (122). In Nairobi, a prospective 
study of 124 HIV seronegative African prostitutes documented HIV 
seroconversion in 83 (67%) (128). Oral contraceptive use, genital ulcers, and 
Chlamydia trachomatis cervical infection were each independently associated 
with increased risk of HIV infection. Condom use reduced the risk of HIV 
infection. Of seroconverting women, 60% experienced one or more episodes 
of genital ulcers in the period before seroconversion, compared with 45% of 
HIV seronegative women. This relationship became stronger when the num- 
ber of ulcer episodes was adjusted for length of follow-up. The mean number 
of annual ulcer episodes was 1.32 ± 0.55 in seroconverting women, compared
with 0.48 ± 0.21 in seronegative women (p < 0.02).

The importance of STDs as cofactors was further emphasized among sexual 
couples in general population surveys. In studies in Rwanda (156) and 
Kinshasa (125), seropositivity was strongly associated with history of STDs 
in both men and women. More recently, several US studies have found that a 
positive serologic test for syphilis (133, 134) and seropositivity to herpes 
simplex virus type II (31), which is the predominant cause of genital herpes, 
were strongly associated with HIV infection among women with or without a 
history of IV drug use. Therefore, STDs appear to be intricately linked to HIV 
epidemiology and represent one of the major explanations for the heterosexual 
epidemic in central equatorial Africa, and for the increasing number of 
heterosexual cases in the US. These findings argue strongly for inclusion of 
STD control in AIDS prevention programs. The development of programs 
with an integrated approach to inducing behavioral change, promoting con- 
dom use, and controlling STDs would reduce the infectiousness of HIV 
transmitters (43) and the susceptibility of HIV-exposed persons (122). Limit- 
ing the transmission of HIV infection among women of reproductive age 
would obviously have the same impact on preventing perinatal transmission 
of HIV to infants. 


Parenteral Transmission 


In the US, 9% of children acquired HIV infection by receipt of HIV-contami- 
nated blood transfusions or blood components, such as factor 8 and 9 
concentrates for hemophilia. Fortunately, with HIV screening of all blood
donations, this mode of transmission has dramatically decreased. In contrast, 
recent outbreaks of HIV infection in children in the Soviet Union, Romania, 
and in many developing countries in Africa and Latin America emphasize the 
risk of nosocomial transmission and the continued need for blood screening 
and sterilization of medical equipment. For example, among hospitalized 
children less than 24 months old in Zaire, five (31%) of 16 seropositive 
infants born to seronegative mothers had been transfused, compared with 15 
(7%) of 220 seronegative children in the same age group (100). Also, 147 
(14.1%) of 1046 pediatric patients in Kinshasa, Zaire, had a history of blood 
transfusion. Of these pediatric patients, 40 (3.8%) were HIV seropositive, 
and there was a strong dose-response association between blood transfusion 
and HIV seropositivity (58). 
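
One way to summarize the first Zaire comparison is as an odds ratio. The source reports only the counts and proportions, so the sketch below is an illustration: it builds the 2x2 table from the quoted figures (5 of 16 seropositive and 15 of 220 seronegative children transfused) and computes a crude, unadjusted odds ratio. The function name is hypothetical, and the resulting figure is not one reported in the cited study.

```python
def odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Crude odds ratio of exposure (here, transfusion) in cases vs. controls."""
    unexposed_cases = total_cases - exposed_cases
    unexposed_controls = total_controls - exposed_controls
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for a finite odds ratio")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Counts quoted in the text for hospitalized children under 24 months in Zaire.
transfused_pos, total_pos = 5, 16     # seropositive infants of seronegative mothers
transfused_neg, total_neg = 15, 220   # seronegative children of the same age

crude_or = odds_ratio(transfused_pos, total_pos, transfused_neg, total_neg)
print(f"Transfused: {transfused_pos / total_pos:.0%} vs {transfused_neg / total_neg:.0%}")
print(f"Crude odds ratio: {crude_or:.1f}")
```

With these counts the crude odds ratio is about 6.2. It ignores confounding and the small number of seropositive infants, so it should be read only as an illustration of the size of the association.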


HIV INFECTION AND PREGNANCY 


Studies in Zaire, Zambia, Uganda, Kenya, Haiti, and Malawi have shown 
higher rates of adverse pregnancy outcomes, such as spontaneous abortion,
stillbirth, prematurity, low birth weight, and neonatal mortality in seroposi- 
tive women compared with seronegative controls (17, 59, 67, 90, 95, 105, 
143). However, the findings have not been consistent and appear to be related
to the severity of maternal HIV disease. In Haiti, children born to HIV
seropositive mothers were significantly more likely to be premature, of low 
birth weight, and malnourished at three and six months of age than were 
infants born to HIV negative women (62). In Nairobi (17), the mean birth 
weight of singleton neonates of HIV positive women was significantly lower 
than that of controls (3090 vs. 3220 g, p = 0.005), and birth weight was less than
2500 g in 9% of cases and 3% of controls [odds ratio (OR) 3.0, p = 0.007].
Among neonates of HIV seropositive women, birth weight was less than 2500 
g in 17% if mothers were symptomatic and 6% if mothers were asymptomatic 
(OR 3.4, p = 0.08). In Malawi, the seroprevalence for HIV infection in 461 
consecutive pregnant women was 17.6% (104). The estimated incidence of
HIV seroconversion in urban pregnant women was 3-4% per annum
between 1985 and 1987, and 7-13% between 1987 and 1989. HIV
infection was significantly associated with a positive syphilis serology and 
correlated with a history of STDs, although the latter correlation was not statistically significant. A
history of spontaneous abortion was also associated with reactive syphilis 
serology, HIV infection, and history of STDs; in a logistic regression analy- 
sis, HIV infection remained the only significant variable. 

Predicting HIV infection in pregnant women without serologic testing has 
been extremely difficult, even in high prevalence areas. Obstetrical history 
may be a better predictor of HIV infection in women of childbearing age than 
socioeconomic and sexual history parameters, with a strong association
between intrauterine fetal death and maternal HIV infection in case-control
studies performed in Nairobi (152) and in Kigali, Rwanda. The rates of 
prematurity, low birth weight, congenital malformations, and neonatal 
mortality and socioeconomic statistics were comparable in the two groups 
(95). However, infants of HIV positive mothers had a mean birth weight
130 g lower than that of infants of HIV negative mothers (p < 0.01).

Because HIV infection in women may be associated with behavioral attri- 
butes, such as alcohol consumption, smoking, illicit drug use, or coinfection 
with such STDs as syphilis or bacterial vaginosis, which may also lead to low 
birth weight or premature birth, it is important to control for these potentially 
confounding factors. In the European prospective studies (10, 51), lower birth 
weight was not related to HIV infection in the child, but to maternal IV drug 
use during pregnancy. It is not possible to ascertain whether the adverse 
pregnancy outcomes reported are the direct consequence of maternal infection 
or caused by fetal infection (110). In other European and US studies, HIV 
infection has not been associated with adverse pregnancy outcomes (80, 147). 
In a study of 39 seropositive and 58 seronegative pregnant women enrolled in 
a methadone program in New York, there were no differences in the frequen- 
cy of spontaneous or elective abortions, ectopic pregnancies, preterm deliv- 
ery, stillbirth, low birth weight, or antenatal, intrapartum or perinatal com- 
plications (147). In Germany, Lutz et al (99) reported no difference in 
pregnancy complications or proportion of low birth weight children in HIV 
positive women with severe lymphocyte depletion. 

Investigators have also suggested that pregnancy may accelerate the course 
of HIV infection, but in more recent prospective studies, in which pregnant 
and nonpregnant infected IV drug users were compared, there was no differ- 
ence in the progression of HIV disease over a three-year period (9, 153). The 
appearance of p24 antigen during pregnancy was transient and not an in- 
dicator of disease progression (9). Further information is needed, especially 
for symptomatic women and those without IV drug use. 


VERTICAL TRANSMISSION OF HIV 


Like other vertically transmitted viral diseases, HIV infection occurs in only a 
portion of children born to HIV infected women. The observed rate of vertical 
transmission has varied widely among prospective studies conducted in the 
United States, Europe, Africa, and Haiti (Table 2) (3, 10, 50, 55, 62, 74, 76, 
108, 143). Not only have rates varied considerably among different locations, 
but longitudinal studies conducted by the same investigators in the same 
populations have, in general, reported declining vertical transmission rates 
over time (50, 51). 


Table 2 Vertical HIV transmission rates in selected locations

Location                     Rate    Reference
Zaire                        39%     143
France                       35%     108
Italy                        33%     76
Miami, Florida 82            30%     74
New York City 55             29%     62
Haiti 0                      25%     55
New Haven, Connecticut       24%     50
Western Europe               13%     10
Edinburgh, Scotland 28       7%      3


The reasons for the wide geographic and temporal variation in reported
vertical transmission rates are not known. The wide confidence limits
surrounding the means in those studies with small numbers may account for
some of the differences, as may the different case definitions and other 
methodologic variations. Some investigators believe that the observed decline 
in vertical transmission rates in some locations over time may reflect the 
increasing efficiency of detecting asymptomatically infected pregnant women 
via obstetrical screening programs. Furthermore, the duration of the HIV
epidemic undoubtedly differs from one location to the next. If women with
more advanced HIV disease are more likely to transmit infection to their 
newborns (vide infra), then higher transmission rates in some locations may 
reflect a longer duration of the HIV epidemic among women of childbearing
age. 
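
To illustrate how sample size alone can widen the uncertainty around an observed transmission rate, the sketch below computes a 95% Wilson score interval for the same observed proportion at several cohort sizes. The cohort sizes and the 30% rate are hypothetical inputs chosen for illustration. The Wilson interval is one common choice and is not necessarily the method used in the cited studies.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half_width, center + half_width


# Hypothetical cohorts, all with an observed transmission rate of 30%.
for n in (30, 100, 1000):
    infected = round(0.30 * n)
    lo, hi = wilson_interval(infected, n)
    print(f"n={n:5d}: observed 30%, 95% CI {lo:.1%} to {hi:.1%}")
```

The interval narrows markedly as the cohort grows, so two small studies with quite different point estimates may still be statistically compatible with one another.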


Factors that May Affect the Risk of Transmission 


MATERNAL FACTORS One of the most important determinants of newborn 
HIV infection may be the stage of maternal infection. To date, this has been 
difficult to assess in our domestic maternal population, because about 80% of 
HIV infected pregnant women followed in US studies have been asymptoma- 
tic (60). However, reports from France and central Africa indicate a much 
higher rate of transmission of HIV from women with advanced symptoms 
than among women who are asymptomatic or have only mild symptoms of 
HIV infection (13, 143, 157). Two of these studies (13, 157) found a marked 
increase in risk of newborn infection when maternal CD4+ cell counts are 
less than 150/mm³. Individuals with advanced HIV infection and low CD4+
cell counts are more likely to have higher concentrations of virus in blood and 
other tissues (41, 68). Not surprisingly, several markers for progressive 
maternal HIV infection correlate with an increased risk of perinatal infection, 
including p24 antigenemia (13), HIV viremia (13), and serum IgA concentra- 
tion (75). 
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Older women also appear more likely to have HIV-infected infants than 
younger women (75). The potential influence of other STDs, particularly 
those that cause genital ulceration, is currently under investigation. To date, 
there is little information on whether rates of vertical transmission vary among 
women with different risk factors for HIV infection. 


GESTATIONAL AGE Investigators at the National Institute of Child Health 
and Human Development and a consortium of New York City hospitals have reported
that infants delivered before 37 weeks’ gestation have a 60% risk of infection, 
compared with an infection rate of 22% for term infants (74). However, Hutto 
and colleagues (74) subsequently reported no increased relative risk of infec- 
tion for premature infants in a cohort of 82 cases prospectively studied in 
Miami. Clearly, conclusions regarding the influence of premature delivery on 
risk of newborn infection must await the results of additional studies.


MATERNAL ANTI-HIV ANTIBODY Since 1989, several groups of in- 
vestigators have suggested that maternal antibody to the HIV viral envelope 
protein gp120 protects the newborn from infection (46, 55, 141). Two of 
these groups reported that the protective maternal antibodies are directed 
against epitopes on, or adjacent to, the immunodominant V3 loop, which is 
the principal neutralizing domain of the gp120 protein (46, 141). If these
intriguing observations are confirmed by others, then it may be possible to 
predict which pregnancies are likely to produce an infected infant, and 
vertical transmission might be prevented by immunomodulation. However, 
Goldstein et al (56) have reported that the presence of maternal HIV neutraliz- 
ing antibody did not affect the rate of transmission. Experienced clinical 
scientists are reserving judgment until more complete data are available. 


Unfortunately, a wider discussion of this topic exceeds the scope of this 
review. 


Mechanisms of Vertical HIV Transmission 


Interest has focused on three possible routes by which HIV is transmitted from 
a pregnant woman to her fetus or newborn infant: intrauterine (or trans- 
placental) transmission, intrapartum transmission, and postpartum transmis- 
sion via breast feeding. The strength and nature of the evidence supporting 
each of these mechanisms varies, and there are no data to indicate the relative 
contribution of each potential route of transmission. 


INTRAUTERINE TRANSMISSION There is little doubt that at least some 
infants are infected in utero early in gestation. HIV has been recovered in cell 
culture from the tissues of fetuses aborted between 12 and 20 weeks’ gestation 
(47, 81, 83, 149). The virus has also been detected by in situ cDNA
hybridization in the peripheral blood mononuclear cells of a one-day-old infant
(65), and by polymerase chain reaction (PCR) in several infants within a few 
days of birth (93, 163). 

Because not all infants of HIV infected pregnant women become infected, 
considerable attention has been paid to the role of the human placenta in either 
facilitating or inhibiting viral passage from the maternal circulation to the fetal 
circulation. The placenta contains several cell types that may support HIV 
replication. The HIV cell membrane receptor protein, CD4, has been identi- 
fied on the surface of trophoblast cells and on stromal macrophages from both 
first trimester and term normal human placentas (2, 56, 103). Maury et al 
(103) have reported that monoclonal antibodies also identify CD4 antigen on 
endothelial cells of villous capillaries. 

Studies of term placentas from HIV seropositive pregnancies have pro- 
duced contradictory data. Chandwani and colleagues (33) found HIV p24 
antigen within rare trophoblast cells in only two of 41 placentas of HIV 
seropositive women, but observed no staining within villous macrophages. In 
contrast, other investigators have noted HIV p24 antigen to be predominantly 
within villous stromal macrophages (78, 102). In the only reported study of 
preterm placental tissue from HIV seropositive pregnancies, Lewis and col- 
leagues (97) identified HIV gp41 antigen by immunoperoxidase staining and 
HIV DNA by in situ hybridization in trophoblast cells and within chorionic 
macrophages of placental tissue obtained at eight weeks’ gestation. 

The numerous macrophages (Hofbauer cells) distributed throughout the 
villous stroma in the human placenta are capable phagocytes (161), can be 
activated by gamma interferon (162), and produce interleukin-1 (52). Many 
of these placental macrophages express membrane CD4 antigen throughout 
pregnancy (56). The role that these cells play in vertical transmission is not 
known. In some model systems, the placental macrophage is associated with 
protection of the fetus from viral infection (106). In human HIV infection, the 
macrophage may protect the fetus, may serve as the mechanism by which the 
placental barrier is breached, or both. 


INTRAPARTUM TRANSMISSION The role of intrapartum (i.e. during labor or
delivery) transmission of HIV is less settled. The majority of infants of HIV 
seropositive women escape infection in utero. It is entirely plausible, if not 
likely, that some of these infants are infected at the time of delivery as a result 
of contact with maternal blood or genital tract secretions. Documenting 
intrapartum transmission is difficult, however. The best evidence, albeit 
indirect, comes from observations that some HIV infected infants test nega- 
tive at birth or during the first few weeks of life, by sensitive and specific 
assays, such as HIV culture (54, 85), PCR (140), and anti-HIV serum IgA 
testing (159). 

If the intrapartum route proves to be an important mechanism for vertically 
acquired HIV infection, it is possible that the postnatal natural history of HIV 
infection will be different for infants infected perinatally, compared with 
infants infected in utero. In fact, survival curves for infected infants appear to 
have a bimodal distribution (145). There are also implications for possible 
prevention of vertical HIV transmission or altering the course of intrapartum 
acquired infection; for example, an antiviral agent, such as zidovudine or 
dideoxyinosine, or HIV immune globulin administered to either the mother or 
the infant in the perinatal period could prevent neonatal HIV infection, just as 
intrapartum hepatitis B infection can be prevented (30). 

A large clinical trial of zidovudine administration to HIV-infected pregnant 
women and their newborn infants has been initiated by the AIDS Clinical 
Trials Group to test this hypothesis. In this multicenter, randomized trial, 
pregnant HIV-infected women receive either zidovudine or placebo beginning 
as early as 14 weeks’ gestation, and their newborn infants receive the same 
preparation as their mothers for six weeks after delivery. Although phase I 
trials in both pregnant women and newborn infants have produced preliminary 
evidence of zidovudine safety in these populations, side effects will also be 
closely monitored during the efficacy trial. 


POSTNATAL TRANSMISSION  There are several reported cases of postnatal 
HIV infection of infants who appear to have acquired HIV infection via breast 
feeding from their postnatally infected mothers (115). HIV has been isolated 
from the breast milk of healthy seropositive women (154). Furthermore, a 
prospective French study suggests an increased risk of infection among 
infants of seropositive women; however, the number of infants at risk was 
small (10). In contrast, several large studies have not demonstrated an in- 
creased risk among children born to HIV seropositive mothers who breast 
feed their infants (N. Halsey 1991, personal communication; 63, 100, 107, 
150). The level of infectivity of breast milk has yet to be established. As noted 
above, one group reports the successful culture of HIV from cell-free extracts 
of breast milk (154), but other investigators have repeatedly failed in their 
attempts to culture virus from breast milk samples obtained from seropositive 
mothers. With the exception of breast feeding, it is unlikely that infants are at 
risk of HIV infection from postnatal maternal exposure. 


CLINICAL PRESENTATION 


The incubation period, or the time between infection and development of 
AIDS, varies considerably among perinatally infected infants. The vast ma- 
jority of HIV-infected infants are asymptomatic at the time of birth. Although 
some HIV infected children may remain minimally symptomatic for several 
years, the median age at AIDS diagnosis is 12 months (114). 

Because relatively few large pediatric cohorts have been prospectively 
followed, the spectrum of HIV manifestations in children is still being defined 
(50, 145). Early signs and symptoms, such as generalized lymphadenopathy, 
hepatosplenomegaly, and failure to thrive, are relatively nonspecific. As in 
adults, progression of HIV infection typically involves multiple organ sys- 
tems. 

HIV is a neurotropic virus, and some degree of neurologic dysfunction 
develops in the majority of infected infants and children (17). Static encepha- 
lopathy, detected in approximately 25% of HIV-1 infected children, is man- 
ifested by nonprogressive cognitive and motor deficits of varying severity (6, 
17). Children may also demonstrate a steady decline in language and motor 
and adaptive skills, which is consistent with a progressive encephalopathy. 
Computed tomographic examinations typically demonstrate cerebral atrophy, 
increased ventricular size, calcification of the basal ganglia, and decreased 
attenuation in the white matter. The majority of neurologic abnormalities in 
HIV-infected children appear to be caused by direct effects of the virus itself. 
Although central nervous system lymphomas and opportunistic infections are 
not infrequent among adult patients with AIDS, they are relatively rare among 
pediatric patients (17). 

Acute or chronic pulmonary disease develops in approximately 80% of 
HIV-infected children (40, 145, 158). Acute pulmonary disease, which is 
generally due to bacterial, viral, or Pneumocystis carinii infections, is dis- 
cussed below. Chronic pulmonary disease involving a spectrum of lymphoid 
lesions is common among HIV-infected children. Focal lymphocytic infiltra- 
tion is seen in some children, whereas the diffuse lymphocytic infiltration of 
alveolar septae characteristic of lymphoid interstitial pneumonitis (LIP) devel- 
ops in others. Although LIP rarely develops in HIV-infected adults, it has 
been reported in approximately 40% of children with perinatally acquired 
HIV infection (40). 

In HIV-infected children over one year of age, LIP generally presents as an 
asymptomatic pulmonary infiltration. Clinical symptoms, including cough, 
tachypnea, wheezing, and hypoxemia, develop gradually. Chest radiographs 
show persistent or progressive bilateral diffuse reticulonodular infiltrates 
unresponsive to antimicrobial therapy. Although the combination of an in- 
dolent clinical course and typical radiographic findings in an older child may 
be highly suggestive of LIP, the definitive diagnosis can only be made by 
biopsy. 

HIV-infected children may have abnormalities of numerous other organ 
systems. Cardiac abnormalities, detected by echocardiography in 62-93% of 
infected children, include pericardial effusion, dilated cardiomyopathy, and 
left and right ventricular dysfunction (84, 98). Electrocardiographic 
abnormalities include ventricular hypertrophy, nonspecific ST-T changes, 
prolonged QT interval, and arrhythmias. The etiology of the cardiac 
abnormalities has not been determined; several factors, including infection by HIV 
and other pathogens and immunologic or nutritional abnormalities, may be 
involved (84). The clinical manifestations of cardiovascular disease may be 
difficult to interpret in the setting of multisystem disease, and early signs of 
myocardial dysfunction may be erroneously attributed to infection. 

Gastrointestinal dysfunction is common and may involve any region of the 
digestive tract. Diarrhea and failure to thrive appear to be the most prevalent 
clinical findings in the pediatric population (48, 124). Enteric parasites, such 
as Cryptosporidium and Giardia lamblia; bacteria, such as Salmonella, 
Shigella, and Mycobacterium avium-intracellulare; and viruses, such as 
cytomegalovirus, have all been detected. Noninfectious causes of 
gastrointestinal symptoms include carbohydrate, protein, and fat malabsorption (124). 

Nephropathy has been detected in 29% of perinatally infected children 
(119). Renal manifestations include nephrotic syndrome, acute nephritic 
syndrome, renal tubular dysfunction, and acute renal failure (144). 

Hematologic abnormalities include normochromic normocytic anemia, 
granulocytopenia, and thrombocytopenia (19). Lymphopenia, frequently 
observed in HIV-infected adults, occurs far less often in infected children. 
Malignancies also appear to occur less frequently in HIV-infected children; 
however, as children receiving antiretroviral therapy survive for longer per- 
iods, the risk of malignancy may increase. 

Both infectious and noninfectious skin disorders are very common among 
HIV-infected children (132). Thrush, monilial diaper rash, and atopic 
dermatitis tend to be more severe and refractory to therapy in HIV-infected 
children. Other common diseases may present with unusual lesions, organ- 
isms, or clinical course. Chronic varicella-zoster infection with atypical 
lesions may require biopsy or culture to establish the diagnosis (82, 116). 
Seborrheic dermatitis and Kaposi’s sarcoma, both frequently detected in 
HIV-infected adults, are much less common among infected children 
(132). 

Abnormalities of the immune system characterize HIV infections. Early in 
the disease, most children with perinatally acquired HIV demonstrate B-cell 
dysfunction, with relative sparing of T-cell function (8, 118, 151). This 
presentation, which differs from that in HIV-infected adults and older chil- 
dren, is probably due to acquisition of HIV during the development of the 
immune system. Among the initial manifestations of B-cell dysfunction is 
elevation of one or all immunoglobulins; hypergammaglobulinemia may be 
one of the first indicators that a child has acquired HIV infection perinatally. 
Despite the high levels of immunoglobulins, HIV-infected children demon- 
strate significant deficiencies in their ability to mount appropriate antibody 
responses to specific antigens (8, 12). B-cell dysfunction may also result in 
increased production of autoantibodies, which may mediate some of the renal, 
cardiac, and hematologic abnormalities found with HIV infection. T-cell 
abnormalities may be less pronounced than those in adults; children are less 
likely to have lymphopenia or profound T helper cell (CD4) depletion. 
However, because normal newborns and young infants have a striking lym- 
phocytosis, CD4 numbers in HIV-infected children, which appear normal by 
adult standards, may represent significant depletion. As discussed below, 
recommendations regarding prophylaxis for Pneumocystis carinii pneumonia 
(PCP) have recently been revised, considering these age-related differences. 

A variety of infections are likely to develop in HIV-infected children during 
the course of their disease. The timing of HIV infection to some extent 
influences the types of other infections acquired by the host. HIV appears to 
interfere with antibody responses to antigens encountered after acquisition of 
HIV. The lymphocytes of children who acquire HIV infection perinatally will 
not have been primed to large numbers of antigens before infection with HIV. 
This immunologic naivete, combined with HIV-induced suppression of the 
humoral immune system, significantly increases the susceptibility of these 
children to bacterial infections (7, 12, 121). Of bacterial diseases, HIV- 
infected children are most likely to have bacteremia and sepsis, pneumonia, 
gastroenteritis, urinary tract infections, sinusitis, and recurrent otitis media. 
Bacteremia is most often due to Streptococcus pneumoniae, followed by 
Haemophilus influenzae type B, enterococcus, group B Streptococcus and 
Salmonella, and other gram negative enteric species (7, 86, 121). Bacteremia 
due to Staphylococcus aureus and Staphylococcus epidermidis, generally 
associated with catheter or wound infections, has also been reported. 
Pulmonary pathogens include S. pneumoniae, Pseudomonas aeruginosa, S. 
aureus, Klebsiella pneumoniae, and H. influenzae, and less commonly 
Salmonella, Nocardia, Listeria, and Legionella (121). 

Opportunistic infections due to pathogens associated with defects in cell- 
mediated immunity also develop in HIV-infected children (64). Pneumocystis 
carinii pneumonia develops in approximately 50% of children with AIDS 
(70). Although PCP in adults is due to reactivation of a previously acquired 
infection, it more likely represents a primary infection in infants and children. 
The risk of pneumocystis appears to be age-related; children less than one 
year old are much more likely to have PCP than are older children (91, 139, 
145). In many young infants, PCP is the first manifestation of HIV infection. 
Although numerous studies have documented an association between low 
CD4 cell numbers and PCP in HIV-infected adults, the CD4 count is not a 
reliable predictor of pneumocystis infection in children (22, 70, 145). Clini- 
cally, PCP is characterized by tachypnea, dyspnea, cough, and fever associ- 
ated with hypoxemia. The onset of PCP is generally acute with a fairly rapid 
progression, particularly in infants; however, a more insidious presentation 
has also been observed. The chest radiograph typically shows bilateral 
diffuse interstitial infiltrates without hilar lymphadenopathy, although a 
variety of other findings have been described. 

Mycobacteria are common opportunistic pathogens in HIV-infected in- 
dividuals. The number of tuberculosis cases is increasing with the AIDS 
epidemic (66). Although the majority of cases have been reported in adults, it 
is likely that tuberculosis due to both greater exposure and immune dysfunc- 
tion will increasingly develop in HIV-infected children. Disseminated infec- 
tion with M. avium-intracellulare complex (MAC) organisms is frequently 
detected in HIV-infected adults and children (49, 72). Multiple organ involve- 
ment and persistent bacteremia are typically present. 

Fungal infections are also significant causes of morbidity and mortality in 
HIV-infected children. Mucocutaneous candidiasis with oropharyngeal and 
esophageal involvement is by far the most common fungal infection; sur- 
prisingly, disseminated candidiasis rarely occurs in patients with AIDS. Other 
fungi, such as Cryptococcus neoformans, which frequently cause systemic 
infections in HIV-infected adults, are unusual in children. The most common 
opportunistic viral pathogens in HIV-infected children are herpes simplex 
virus, varicella-zoster virus, and cytomegalovirus. 


Laboratory Diagnosis of HIV Infection in Infants 


Early diagnosis of perinatally acquired HIV infection is needed to identify 
infants that might benefit from early antiviral therapy and prophylactic treat- 
ment for opportunistic infections, and to determine the timing of transmission 
from mother to infant. Early diagnosis would also aid parents and other 
caretakers who want to know their children's HIV infection status 
as soon as possible. Furthermore, natural history data suggest that the 
time scale for disease progression in children is compressed compared with 
adults. From a cohort of 172 perinatally infected infants, 25% of the cohort 
died by two years of age (145). Pneumocystis carinii pneumonia occurred in 
over 10% of these infants at a median age of five months and was associated with a median 
survival of one month. Thus, for some infants the window of opportunity for 
intervention between laboratory diagnosis and the development of symptoms 
is narrow. 

Diagnosis of HIV infection in infants during the first year of life is, 
however, problematic. The clinical manifestations of HIV infection in chil- 
dren are varied and nonspecific, including chronic pneumonitis, failure to 
thrive, hepatosplenomegaly, thrombocytopenia, and chronic diarrhea. Di- 
agnosis of HIV infection is more difficult in infants because the current testing 
for evidence of infection depends on serologic confirmation of the presence of 
IgG antibody to specific viral proteins of HIV. However, all infants passively 
acquire maternal IgG antibodies in utero, which can persist for 15 months (51, 
79). Serum tests for IgG antibody, therefore, do not differentiate between 
infant and maternal antibody; thus, a positive IgG HIV antibody test in an 
infant only indicates exposure. Consequently, in children less than 15 
months, documentation of HIV infection requires a more thorough investiga- 
tion of the immune system with CD4 lymphocyte determination, exclusion of 
congenital immunodeficiency, and identification of viral components (such as 
p24 HIV antigen), HIV culture, or PCR. Alternative tests in the process of 
evaluation include assays for neonatal IgM and IgA and in vitro assays to 
determine the ability of neonatal peripheral blood mononuclear cells to secrete 
HIV-specific IgG antibody (155). 

The current gold standard for establishing HIV infection in neonates is 
recovery of the virus from the infant by culture. Although this is a highly 
specific assay, its sensitivity varies and may be as low as 50% in HIV-infected 
infants during their first few weeks of life because of the low viral load (89). 
Culture has limited use as a diagnostic test for HIV. Cultures typically take 
7-28 days or more to complete and require special biosafety precautions to 
prevent exposure to laboratory personnel. Cultures are costly, labor intensive, 
and not practical for resource-poor settings. The sensitivity of virus culture for 
detecting HIV infection also varies among laboratories and during the course 
of illness. 

DNA amplification by PCR offers several advantages over culture. Be- 
cause PCR detects the presence of the virus, rather than antibody to the virus, 
it avoids the problem of persistent maternal antibody (73). Polymerase chain 
reaction requires a small amount of blood and can be performed within 24 
hours. Although sensitivity appears to be improved compared with culture, it 
is also limited during the early neonatal period. For example, in a study of 
infants later defined as being HIV-infected, only eight of 20 had detectable 
proviral sequences by PCR in the neonatal period within the first week of life 
(140). Of the 11 infants who had CDC-defined AIDS in the first 18 months, 
seven were positive for PCR in the neonatal period, compared with one of 
nine other infants who have other HIV-related clinical signs and symptoms, 
thus suggesting a prognostic role for PCR. All 22 HIV-infected infants tested 
in the postnatal period were PCR positive; of these, 20 were PCR positive by 
six months of age. Some infected infants test negative in the neonatal period, 
possibly because infection occurred during late gestation or in the intrapartum 
period and their level of virus is still below the detection limit of the test. 
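
The counts quoted above from reference (140) can be turned into proportions to make the small-sample uncertainty visible. The following Python sketch is an illustration only: the Wilson score interval is a standard method chosen here as an assumption, it is not reported in the original study, and the function and variable names are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% when z = 1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Counts taken from the text: 8 of 20 infected infants were PCR positive
# in the neonatal period.
sensitivity = 8 / 20
low, high = wilson_interval(8, 20)
print(f"Neonatal PCR sensitivity: {sensitivity:.0%} "
      f"(approx. 95% interval {low:.0%} to {high:.0%})")

# Prognostic comparison from the text: neonatal PCR positivity among infants
# who developed CDC-defined AIDS (7 of 11) versus other infected infants (1 of 9).
aids_rate = 7 / 11
other_rate = 1 / 9
print(f"PCR positive at birth: AIDS group {aids_rate:.0%}, other infected {other_rate:.0%}")
```

With only 20 infected infants, the interval around the 40% point estimate is wide, which is consistent with the caution the text expresses about neonatal PCR sensitivity.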

The specificity of the assay in this particular study was excellent (140). 
None of 93 infants who had lost maternal antibody and remained antibody 
negative were repeatedly PCR positive. Three of 93 infants tested PCR 
positive on one occasion; however, subsequent PCR testings were negative, 
and these children remained clinically well. As with culture, there are several 
limitations with this assay besides its low sensitivity in the neonatal period. 
Currently, the test is not completely standardized, and different laboratories 
report varying sensitivities and specificities with known samples. Finally, 
PCR is not widely available in a diagnostic format and still may be impractical 
for developing countries. However, it is hoped that soon the technology will 
be inexpensively exportable to the developing world. Paterlini et al (120) 
detected HIV DNA sequences in 20 (64%) of 31 babies born to seropositive 
mothers in Kinshasa. Clearly, this high rate needs confirmation and may 
represent falsely high rates because of contamination, a particular problem in 
developing countries. 

Serologic assays, which are less expensive and better standardized, offer 
yet another alternative to early diagnosis. Because IgM and IgA antibodies do 
not cross the placenta, assays for the measurement of these antibodies to HIV 
have been developed. IgM assays have lacked sensitivity and specificity 
because of cross reaction with rheumatoid factor and the transient nature of 
IgM antibodies (160). However, the sensitivity of IgA detection increases by 
removal of IgG, which competes for antigen-binding sites. In a preliminary 
study, Weiblen et al (159) demonstrated IgA antibodies in 12 of 18 samples 
from HIV-infected infants aged six to 12 months, five of 10 of infants aged 
three to five months, and two of 13 of infants under three months. More 
promising results were recently reported in another study, in which IgA 
antibody was detected in eight of nine infected infants by 12 weeks of age 
(101). 
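
The age dependence of IgA detection described above can be tabulated directly from the quoted counts. This Python sketch is a minimal illustration; the data structure and names are hypothetical, and the counts are those given in the text for references (159) and (101).

```python
# (age group, IgA-positive samples, samples tested), as quoted for reference (159)
iga_results = [
    ("under 3 months", 2, 13),
    ("3 to 5 months", 5, 10),
    ("6 to 12 months", 12, 18),
]


def detection_rates(rows):
    """Return the fraction of IgA-positive samples for each age group."""
    return {label: positive / tested for label, positive, tested in rows}


rates = detection_rates(iga_results)
for label, rate in rates.items():
    print(f"{label:>15}: {rate:.0%}")

# The second study (101) reported 8 of 9 infected infants positive by 12 weeks.
later_study_rate = 8 / 9
print(f"Reference (101), by 12 weeks: {later_study_rate:.0%}")
```

The tabulation shows detection rising with age in the first study, while the second study reports a much higher rate at a comparable early age; the samples are small in both.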

Another serologic assay is the p24 antigen assay. Available HIV antigen 
detection kits fail to detect serum p24 antigen in the presence of high titers of 
HIV-specific antibodies. Studies of infants born to HIV-infected mothers 
have found very few infants to be antigen-positive early in the course of 
infection because of the presence of excess maternal antibody (15). As 
maternal IgG antibody declines, infant p24 antigen may be transiently 
measurable, although some of this may be bound by infant IgG antibody, 
which increases in titer with age. 

Other techniques for perinatal diagnosis include the in vitro antibody 
production assay (IVAP) (117) and the ELISPOT (94), which detect the 
presence of antibody-producing B-lymphocytes, not the antibody itself, thus 
avoiding the problem of persistent maternal antibody. Preliminary data indi- 
cate that the IVAP test and ELISPOT, although sensitive, were neither 
specific nor predictive for HIV infection during the first few months of life. 
These data, like those from many of the other assays, confirm the difficulty with early 
diagnosis during the first few months of life, but reliability of the assays 
appears to improve after three to six months of age. 


Clinical Diagnosis 


The Centers for Disease Control has developed a definition and classification 
system of HIV infection in children less than 13 years of age (31). However, 
in developing countries, limited diagnostic capabilities have precluded routine 
use of the CDC criteria. In these areas, diagnosis relies upon the provisional 
pediatric clinical case definition of AIDS developed by WHO (164). 
Although the WHO case definition appears to be fairly specific, it lacks 
sensitivity and positive predictive value (137). HIV-infected children who die 
acutely with an overwhelming infection may be missed by this case defini- 
tion, which emphasizes chronic signs and symptoms. In addition, the broad 
spectrum of disease associated with HIV in children, and the overlap with 
other common diseases in developing countries, hinders the diagnosis of 
AIDS based solely on clinical criteria. Thus, in countries with limited di- 
agnostic resources, establishing the diagnosis and determining the natural 
history of HIV infection will continue to be a significant problem. 


CLINICAL MANAGEMENT 


Clinical management of an HIV-infected child should include both specific 
antiviral therapy and aggressive diagnosis and treatment of associated in- 
fectious and noninfectious conditions. Several therapeutic agents with activity 
against HIV, including the dideoxynucleosides, azidothymidine (zidovudine, 
AZT), dideoxycytidine (ddC), and dideoxyinosine (ddI), are being evaluated 
in both pediatric and adult patients. By inhibiting reverse transcriptase, these 
drugs interfere with HIV replication. In a recently completed phase II study, 
children receiving zidovudine showed an improvement in weight gain and 
cognitive function; their serum and cerebrospinal fluid p24 antigen levels 
declined, and their CD4 cell counts transiently improved (104). Studies 
currently underway include one in which two doses of zidovudine are being 
compared in less symptomatic children and one involving the 
coadministration of IVIG or placebo. Smaller studies, which use ddI, ddC, or soluble 
CD4, are also being conducted. 

Zidovudine is currently the only agent approved for use in children. The 
standard dose is 180 mg/square meter given every six hours. The most 
common toxicity is bone marrow suppression with anemia and, less often, 
neutropenia, both of which generally respond to dose reduction or temporary 
discontinuation of the medication (104). Children who cannot tolerate zidovu- 
dine or who have progressive disease while on therapy can be considered for 
trials that use other antiretroviral agents. 
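
The body-surface-area dosing quoted above (180 mg per square meter every six hours) implies a simple calculation. The Python sketch below illustrates the arithmetic only, using the dose stated in this text. The Mosteller formula for body surface area is an assumption introduced here, since the text does not say how surface area is estimated, and the example child's height and weight are hypothetical.

```python
from math import sqrt

DOSE_MG_PER_M2 = 180.0  # per-dose figure quoted in the text
DOSES_PER_DAY = 4       # "every six hours"


def mosteller_bsa(height_cm, weight_kg):
    """Body surface area in square meters (Mosteller formula; an assumption here)."""
    return sqrt(height_cm * weight_kg / 3600.0)


def zidovudine_dose_mg(height_cm, weight_kg):
    """Return (mg per dose, mg per day) for the dose quoted in the text."""
    per_dose = DOSE_MG_PER_M2 * mosteller_bsa(height_cm, weight_kg)
    return per_dose, per_dose * DOSES_PER_DAY


# Hypothetical child: 100 cm tall, 16 kg
per_dose, per_day = zidovudine_dose_mg(100.0, 16.0)
print(f"{per_dose:.0f} mg per dose, {per_day:.0f} mg per day")
```

Because the dose scales with surface area rather than weight alone, the result depends on which surface-area formula is used; other published formulas would give slightly different values.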

Opportunistic and other serious infections are a major cause of morbidity 
and mortality among HIV-infected children. Therefore, early detection and 
treatment of these infections is critical. It is important to remember that 
common childhood pathogens are also likely to be problems in HIV-infected 
children; however, the diagnosis and management of these diseases may be 
complicated by an atypical presentation or progression. An HIV-infected 
child who has an acute febrile illness should be aggressively evaluated to 
determine the site of infection and potential etiologic agent. Because in- 
fections may rapidly progress and the social situation is often not optimal, the 
decision to observe a febrile HIV-infected child at home should be made with 
caution. 

The presentation of an HIV-infected child with fever, cough, and dyspnea 
should prompt consideration of PCP, as well as other pulmonary infections. 
Differentiation of PCP from other infections or LIP by radiograph may be 
very difficult, and definitive diagnosis requires demonstration of P. carinii in 
specimens generally obtained by bronchoscopy or open lung biopsy. Tissue 
specimens are particularly useful to determine the presence of concurrent 
infections with cytomegalovirus or other organisms. Induced sputum, often 
used to establish the diagnosis of PCP in adults, cannot reliably be obtained in 
very young children. Because untreated PCP is associated with high mortality 
rates, therapy should be started quickly; it can be started presumptively in 
very ill children or in situations in which the diagnostic workup is likely to be 
delayed. The therapy of choice is intravenous trimethoprim-sulfamethoxa- 
zole; although associated with a relatively high incidence of adverse effects in 
adults with AIDS, fewer data regarding toxicity in pediatric patients are 
available (57). Patients who fail to respond clinically after five to seven days 
of therapy or who cannot tolerate trimethoprim-sulfamethoxazole should be 
treated with pentamidine isethionate. Although both drugs are equally effica- 
cious, pentamidine is associated with a higher incidence of serious adverse 
effects. Therapy of PCP is generally continued for two to three weeks; the 
high rate of recurrence in HIV-infected patients indicates the need for subse- 
quent prophylaxis. 

The use of corticosteroids should be considered as adjunctive therapy. 
Studies of adults with AIDS have indicated that the early use of steroids 
reduces the risk of respiratory failure and improves survival (15, 53). 
Although no data are yet available regarding the use of corticosteroids in 
HIV-infected children, many centers are now using them in children with 
moderate to severe PCP. 

The high mortality rates associated with PCP among HIV-infected infants 
and children warrant aggressive use of chemoprophylaxis. In children with 
perinatally acquired HIV infection, the risk of PCP is greatest during the first 
year of life; therefore, prophylaxis often needs to be instituted before a 
definitive diagnosis of HIV infection has been established. The Working 
Group on PCP Prophylaxis in Children recently issued guidelines for initia- 
tion of PCP prophylaxis for HIV-infected children (22). Trimethoprim- 
sulfamethoxazole, administered three times per week, is the recommended 
regimen. Although few data are available regarding efficacy in children, 
aerosolized pentamidine can be used for PCP prophylaxis in children aged 
five years or older (22). Intravenous pentamidine or dapsone are also being 
used prophylactically in some children. 

The management of other opportunistic infections is problematic. The 
diagnosis of MAC infection is generally established by blood mycobacterial 
culture or histopathology of biopsy specimens. Although numerous drug 
regimens have been examined, no effective therapy for MAC infections is 
currently available; new drugs with increased in vitro activity against MAC 
are now being examined in adults and children. Oral candidiasis can be 
diagnosed clinically, whereas the definitive diagnosis of esophageal candi- 
diasis requires culture and histopathology of specimens obtained at endo- 
scopy. Nystatin or clotrimazole are often adequate treatment for orophary- 
ngeal candidiasis, whereas esophageal candidiasis requires the use of ketoco- 
nazole, or less often, amphotericin B. The diagnosis of cytomegalovirus 
(CMV) infections, other than retinitis, generally requires histopathologic 
evidence of invasive disease and isolation of the virus. Ganciclovir, an 
analogue of acyclovir, has been used in small numbers of HIV-infected 
children with severe CMV-related disease (18). There are no data currently 
available regarding use of other therapeutic agents, such as foscarnet, high 
dose acyclovir, CMV hyperimmune globulin, or interferon, in HIV-infected 
children. 

The management of other HIV-related noninfectious conditions is largely 
supportive. Initial management of an asymptomatic child with LIP should 
include aggressive therapy of intercurrent pulmonary infections; appropriate 
use of influenza, pneumococcal, and Haemophilus influenzae vaccines; and 
close monitoring for progression of LIP. Children with significant hypoxemia 
may require supplemental oxygen. Several investigators have advocated the 
use of a 4-12 week course of prednisone in children with PaO2 less than 65 
mmHg (40, 113). 

Other organ systems should be carefully evaluated in HIV-infected chil- 
dren. These children should undergo routine cardiovascular screening with 
electrocardiography and echocardiography at six-month intervals. Patients 
with significant gastrointestinal dysfunction may become malnourished and 
rapidly deteriorate clinically; therefore, reversible causes of gastrointestinal 
disease, such as infectious diarrhea, should be aggressively sought, and 
nutritional support should be implemented early in the course of HIV infec- 
tion. Renal function should be monitored, urinary tract infections should be 
promptly treated, and potentially nephrotoxic drugs should be used cautious- 
ly. If drug toxicity, acute tubular necrosis, or other potentially reversible 
conditions precipitate acute renal failure in a child who otherwise has a 
reasonable prognosis for short-term survival, acute dialysis should be consid- 
ered (144). Decisions regarding dialysis for irreversible renal failure are made 
on an individual basis, depending on the patient’s overall state of health. 

Although several immunomodulating agents are under investigation in 
HIV-infected adults, relatively few options are available to the pediatric 
patient. A recently completed multicenter placebo-controlled IVIG trial con- 
cluded that monthly IVIG prolonged the time free from serious bacterial 
infections in children with symptomatic HIV infection and CD4 cell counts 
greater than 200 (109). These data have led the American Academy of 
Pediatrics to recommend that pediatricians consider the use of IVIG for their 
patients. 


Prognosis 


Infants with perinatally acquired HIV infection progress clinically and im- 
munologically much more quickly than adults (114, 145). Available data 
suggest that children who have AIDS or become symptomatic during the first 
year of life have median survival times of 6.7 months and 24.8 months, 
respectively (139, 145). Children who have LIP appear to have a more 
favorable prognosis than do children who have opportunistic infections. 
Although the use of zidovudine may somewhat improve their prospects of 
survival, the overall prognosis for HIV-infected children remains bleak. 


PREVENTION 


As the numbers of HIV-infected women continue to increase in this and other 
countries, so does the specter of perinatal HIV infection. HIV, which 
has emerged as the ninth leading cause of death in children aged one to four 
years in the United States, has already had a significant impact on infant 
survival (146). The accelerated course of the disease in children and the 
inadequacy of available therapeutic modalities make prevention of infection a 
priority. Preventive efforts include two approaches: prevention of transmis- 
sion to the infant and prevention of infection in women of childbearing age. 

A multicenter study has recently evaluated the safety, tolerance, and 
pharmacokinetics of zidovudine administered to 30 infants born to HIV 
seropositive women. Infants less than 30 days old demonstrated increased 
clearance of zidovudine, and overall the drug was well tolerated (131). The 
next step, a trial designed to determine whether maternal infant transmission 
can be prevented with zidovudine, is being undertaken in several medical 
centers under the auspices of the AIDS Clinical Trials Group. This study 
assumes that a significant portion of HIV transmission occurs around the time 
of delivery and that the use of zidovudine can interrupt such transmission. 
HIV seropositive pregnant women will be randomized to receive either 
zidovudine or placebo during pregnancy and through delivery; infants will be 
treated for six weeks and then followed for 18 months. Data from this trial 
will not be available for several years. Additional trials with other therapeutic 
agents are being considered. 

Women in the United States are now becoming infected with HIV primarily 
through heterosexual transmission or through intravenous drug use. Because 
attempts to diminish high risk sexual activity or drug use have often been 
ineffective, prevention of transmission has been very difficult. Increasing 
numbers of obstetricians are recognizing their obligation to educate, counsel, 
and screen their patients; however, their ability to alter high risk behavior 
effectively is limited. Intense educational efforts must be undertaken in this 
country; unfortunately, the populations at highest risk are also the most 
difficult to reach. There is now increasing recognition of the urgency to 
educate adolescents, in an attempt to modulate high risk behaviors traditional- 
ly undertaken during adolescence. The difficulties inherent in changing high 
risk behavior of any population should not dissuade public health practition- 
ers. The reality of the AIDS epidemic, with its increasing toll on children, 
should provide sufficient motivation for the development, implementation, 
and careful evaluation of intervention strategies. 


INTRODUCTION 


An important recent trend in health promotion and disease prevention has 
been the increasing number and scope of community-based interventions. 
These programs are aimed at entire populations, which are usually 
geographically defined, and they attempt to change health behavior and disease 
risk through mass media campaigns, activation of existing community 
organizations, or changes in the physical or sociocultural environment. Several large 
programs of this kind have been mounted for cardiovascular disease preven- 
tion (30, 33, 44, 56, 71), as reviewed by Shea & Basch (79, 80), and the 
approach is increasingly being applied to other disease areas and populations 
(3, 34, 67, 89, 92). As investment in community-based programs has grown, 
so has the importance of evaluating their effectiveness, as evidenced in part 
by the recent publications of Green & Lewis (38) and Bracht (6). In this 
review, we focus on a selection of methodological issues that assume special 
importance in evaluating community-based programs, but receive little cover- 
age in standard texts on program evaluation. These issues include: 

1. Specification of the theoretical model. The design of an intervention is 
usually based on some theory of program action. An important early step in 
program evaluation is to make this theoretical model explicit: What are the 
key intervention components, and what are the causal mechanisms by which 
they are expected to work? An explicit model is needed to guide evaluation 
design decisions, to help identify the specific shortcomings of a program 
found to be ineffective, or to facilitate dissemination of an effective one. The 
task can be complex for community-level interventions aimed at individual- 
level health behavior because of the need for a multilevel conceptualization. 

2. Communities as units of allocation. Because interventions aim at entire 
communities, an evaluation design with concurrent controls will likely in- 
volve assignment of communities en bloc to intervention and control groups. 
This feature has important implications for both planning study size and data 
analysis. 

3. Allocation of a small number of communities. Cost and feasibility 
considerations usually limit the intervention and evaluation to a small number 
of communities, thus complicating the task of achieving comparable study 
groups. 

4. Longitudinal versus repeated cross-sectional samples. Community sur- 
veys may be needed to measure change in certain key outcomes. These 
surveys can be conducted by either following a panel of individuals in each 
community over time or drawing a fresh cross-sectional sample in each 
community at each time point. Both approaches have unique strengths and 
drawbacks. 

5. Validity of self-reported health characteristics. Particularly because of 
the highly public nature of the intervention and the inability to blind partici- 
pants to their treatment group membership, the validity of self-reported data 
on health behavior can be a concern. 

6. Measures of community environment. Assessing features of the community 
environment can help test the underlying causal model, detect early 
program effects, and avoid excessive reliance on self-reported behavior 
change. 


We now discuss each of these six issues in turn. 


SPECIFICATION OF THE THEORETICAL MODEL 


The randomized controlled trial has become a widely accepted paradigm for 
evaluating the effect of health interventions, against which nonexperimental 
methods are judged and often found wanting. The design and the size of most 
randomized trials are usually driven by a primary research question, which 
typically concerns the effect of an intervention on final outcomes. Un- 
fortunately, this focus on final outcomes may result in overlooking the need to 
characterize both the intervention itself and the causal mechanisms by which 
it is supposed to work. Interventions then become “black boxes” whose 
overall effects may be detectable, but whose contents are obscure. Careful 
specification of the intervention and its presumed mechanism of action is an 
important step in designing an appropriate evaluation. 

What are black box interventions? Lipsey (57) describes them as “situations 
for which inputs and outputs can be observed, but the connecting processes 
are not readily visible.” The black box then contains the causal sequence 
between inputs (e.g. receipt of grant funds and formation of a community 
coalition) and outputs (e.g. cessation of cigarette smoking). For simpler 
interventions, such as an immunization program, opening the black box, 
albeit desirable, may not be as essential to interpreting the evaluation results, 
replicating effective interventions, or tinkering with ineffective ones. For 
such interventions as community-based prevention efforts, the contents of the 
black box are much more complex, and their obscurity is a serious deterrent to 
understanding and progress. 

A key reason to open black boxes is to improve interventions. With this in 
mind, an approach to process evaluation based on theoretical considerations 
has emerged in the evaluation literature (13, 14). At the heart of the approach 
is the notion of treatment theory, which describes how program inputs 
translate into outputs. An optimal treatment theory is specific enough to guide 
evaluation design and analysis, yet general enough to illuminate the field. The 
more critical need, however, is for specific applicability to the intervention 
under study and to the context in which it will be implemented. This need has 
led Lipsey (57) to label such intervention theories as “small theories of 
treatment.” Large theories, such as diffusion theory or exchange theory, 
might guide the elaboration of treatment theory, but can be too abstract and 
general to guide evaluation design. 

A useful treatment theory provides a model to show how the program will 
produce its postulated effects. At minimum, it must include key inputs (e.g. 
formation of a new community coalition) and outputs (e.g. avoidance of 
substance use by adolescents), and the sequence of events or processes 
connecting them. For community-based prevention programs, these events or 
processes must delineate a believable scenario by which the mobilization of 
community organizations and programs can motivate and assist individual 
citizens to change their behaviors. A useful small theory of treatment would 
describe how grant funds, program specifications, technical assistance, and 
other inputs translate into effective community structures that can produce and 
disseminate intervention components with a chance of success. 

A critical aspect of useful treatment theory and process evaluation, in 
general, is the specification of key steps in program implementation (75). For 
most community-based health programs, major concerns include the 
functionality of the community coalition or board, the scientific quality of 
intervention components as actually delivered, and the exposure of communi- 
ty residents to those interventions. 


Treatment Theory and Evaluative Design 


A good treatment theory can greatly enhance the design, analysis, and 
interpretation of an evaluation (5, 57). From a study design perspective, there 
is almost no limit to what can be measured in a community-based program. 
Important events and processes may occur in the community environment or 
among community organizations, political leaders, health care providers, or 
individual members of the target population. Choosing the variables to mea- 
sure requires some means of distinguishing that which is essential for de- 
termining program success or failure from the rest. Program theory provides a 
blueprint for measurement because, by definition, it specifies the critical steps 
on the path from input to output. 

For example, Figure 1 shows the “small theory” of treatment that guides the 
evaluation of the Henry J. Kaiser Family Foundation’s Community Health 
Promotion Grants Program (89). The conceptual basis for the model (27, 39, 
40), based in social learning theory (a “large theory”) (2), emphasizes mod- 
ifying community norms and inducing changes in the physical, regulatory, 
and socioeconomic environments to make them more supportive of healthful 
behaviors and behavior change. To accomplish this, the model posits that 
projects must first activate their communities by developing a broadly based 
consensus among leading community organizations to address a health prob- 
lem, coordinate planning, share resources, and engender broad citizen in- 
volvement. The “activated community” reaches individual citizens through 
high quality intervention components that change norms toward approval of 
healthful behaviors and disapproval of unhealthful ones (e.g. in media 
messages), change environments to encourage healthful behaviors and discourage 
unhealthful ones (e.g. worksite smoking policies), and provide more models 
of individuals who have adopted healthful norms and behaviors. 

Figure 1 Intervention model. 

Measurements in this evaluation were then selected to correspond to the 
major components of the treatment theory. A survey of leaders of key 
community organizations provides data to assess the extent to which commu- 
nity organizations were collaborating and generating intervention activities. 
Surveys of restaurants and grocery stores and reviews of legislative activity 
monitor environmental changes, whereas surveys of adult and adolescent 
residents furnish information about exposure to interventions, norms, and 
behavioral models, as well as behaviors. 


Treatment Theory and Data Analysis 


Treatment models are also analytic models, which specify independent, 
dependent, and mediating variables and depict causal pathways. Judd & 
Kenny (47) show how modern multivariate statistical techniques can be used 
to test the relationships posited by a treatment theory. Lipsey (57) argues that 
the use of treatment theory to select appropriate, sensitive outcome measures 
may mitigate the common problem of insufficient statistical power in social 
evaluations by increasing the true effect size associated with effective treat- 
ments. 


Treatment Theory and the Interpretation of Evaluation 
Results 


Community-based prevention programs that address health-related behaviors 
may not always produce dramatic effects. Evaluation findings have often been 
mixed (28), controversial (70), or negative (94) and are likely to continue to 
be so. Treatment theory may clarify the meaning of findings by delineating 
the role of the treatment, or aspects of the treatment, as the cause of a positive 
or negative result. Concomitantly, treatment theory may play a crucial role in 
disentangling bad evaluation methods from bad treatment ideas from bad 
treatment implementations (57). 


Treatment Theory and the Advancement of 
Treatment Effectiveness 


Most community-based prevention programs resemble each other, at least in 
general ways. Evaluations based on treatment theory should advance the state 
of the art by identifying the details of good ideas for replication or enhance- 
ment and bad ideas for a return trip to the drawing board. 


COMMUNITIES AS UNITS OF ALLOCATION 


Under our definition, community-based interventions are aimed at entire 
communities. Hence, an evaluation that uses a concurrently studied control 
group must generally use entire communities as controls. The unit of alloca- 
tion in this design is thus the community, even though many outcome 
measures (including all those discussed in this section), such as smoking 
status or dietary fat intake, may originate with observations on individuals in 
those communities. Sometimes, communities are actually randomized to 
intervention and control groups (34, 89), as we discuss later. However, 
nonrandomized designs must also deal with the consequences of community- 
level allocation, and they require added attention to the possibility of com- 
munity-level confounding factors. 

Probably, the most important consequences of allocation by community are 
reduced statistical power and added complexity in estimating sample size 
requirements or statistical power. When we have person-level outcome mea- 
sures, but community-level allocation and analysis, two sources of random 
variation must be considered and estimated: individual-level variation within 
a community and community-level variation within a treatment group. We 
must also consider two kinds of sample sizes: the number of individuals per 
community and the number of communities per treatment group. For a fixed 
total number of individuals studied, statistical power is almost always lower 
when allocation is by community (or cluster) rather than by individual, as 
shown in a short and accessible paper by Cornfield (17). At least under 
classical methods of analysis, part of the loss of power occurs because the 
number of degrees of freedom for a statistical test of treatment effect depends 
on the number of communities studied, not on the number of individuals 
studied in those communities (17, 48). When the number of communities is 
small, this number of degrees of freedom is also small, and the critical value 
that a test statistic must achieve is higher than for studies that allocate 
individuals. This effect on power can grow large when the number of 
communities falls below about ten. 

More specifically, the power to detect an effect of the intervention depends 
directly on the precision with which the mean level of the relevant outcome 
can be estimated for each treatment group. For a simple design involving 
randomization of c communities to an intervention group and c more to a 
control group, with n individuals studied per community, the expected sam- 
pling variance of the mean for each treatment group is: 


$$\frac{\sigma_C^2 + \sigma^2 / n}{c},$$


where $\sigma_C^2$ is the community-level variance component (i.e. variance in the 
true mean level of the outcome variable among communities) and $\sigma^2$ is the 
individual-level variance component (i.e. variance in the outcome variable 
among individuals within a community). As a rule, the evaluator has little 
control over the size of $\sigma_C^2$ or $\sigma^2$, but must estimate them both to estimate 
study power. 

The above expression also shows that if $\sigma_C^2$ is at all large relative to $\sigma^2$, 
there are likely to be only modest gains from studying more individuals per 
community (i.e. increasing n), but potentially major gains in power from 
studying more communities per treatment group (i.e. increasing c). Of 
course, these two options for enhancing power may have quite different cost 
implications. In some situations, the marginal cost of each intervention site 
may be large, but the marginal cost of a control site may be more modest. If 
so, the evaluator may wish to form unequal-sized treatment groups, with more 
control sites than intervention sites. 
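
To make this trade-off concrete, the following sketch evaluates the variance expression for a few designs. The variance components are the Kaiser REML estimates reported in Table 1 (smoking coded 0/100). The function name and the choices of n and c are hypothetical and serve only as an illustration, not as a recommended design.

```python
def group_mean_variance(sigma2_c, sigma2, n, c):
    """Expected sampling variance of a treatment-group mean.

    sigma2_c: community-level variance component
    sigma2:   individual-level variance component
    n:        individuals studied per community
    c:        communities per treatment group
    """
    if n <= 0 or c <= 0:
        raise ValueError("n and c must be positive")
    return (sigma2_c + sigma2 / n) / c


# Variance components from Table 1 (Kaiser data, REML); n and c are illustrative.
SIGMA2_C = 10.7
SIGMA2 = 1800.7

baseline = group_mean_variance(SIGMA2_C, SIGMA2, n=500, c=5)
double_n = group_mean_variance(SIGMA2_C, SIGMA2, n=1000, c=5)
double_c = group_mean_variance(SIGMA2_C, SIGMA2, n=500, c=10)
floor = SIGMA2_C / 5  # limit as n grows without bound, with c fixed at 5

print(f"baseline (n=500,  c=5):  {baseline:.2f}")
print(f"double n (n=1000, c=5):  {double_n:.2f}")
print(f"double c (n=500,  c=10): {double_c:.2f}")
print(f"floor as n increases:    {floor:.2f}")
```

Under these assumed inputs, doubling the number of individuals per community lowers the variance only from about 2.86 to about 2.50, and no increase in n can push it below 2.14. Doubling the number of communities per group halves it to about 1.43.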

An equivalent way of considering this issue (23) is to note that, under 
community allocation, observations on the individuals in each treatment 
group cannot be considered statistically independent of each other, as they can 
under individual allocation. Instead, observations on individuals who reside 
in the same community tend to be correlated. For continuous variables, the 
appropriate measure of correlation is the intraclass correlation, which can be 
expressed as 


$$\rho = \frac{\sigma_C^2}{\sigma_C^2 + \sigma^2}.$$

The formulation based on correlated observations is thus closely linked to that 
based on variance components, as the intraclass correlation can be viewed as a 
measure of the relative sizes of the two variance components. Mickey et al 
(61) discuss this issue in terms of the design effect and show how its 
magnitude can depend on study duration. 
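
The link between the two formulations can be checked numerically. The sketch below assumes the usual definition of the intraclass correlation as the community-level share of total variance, and it assumes the standard design effect for clusters of equal size, 1 + (n - 1) * rho. That formula is not taken from this text and may differ in detail from the treatment in Mickey et al (61). The inputs are the Kaiser REML estimates from Table 1, with hypothetical values of n and c.

```python
def intraclass_correlation(sigma2_c, sigma2):
    """Share of total variance that lies between communities."""
    return sigma2_c / (sigma2_c + sigma2)


def design_effect(rho, n):
    """Variance inflation for clusters of equal size n: 1 + (n - 1) * rho."""
    return 1.0 + (n - 1) * rho


# Kaiser REML estimates from Table 1; n and c are illustrative.
sigma2_c, sigma2 = 10.7, 1800.7
n, c = 500, 5

rho = intraclass_correlation(sigma2_c, sigma2)
deff = design_effect(rho, n)

# Variance of a treatment-group mean computed two ways.
by_components = (sigma2_c + sigma2 / n) / c
by_correlation = (sigma2_c + sigma2) / (n * c) * deff

print(f"rho = {rho:.4f}, design effect = {deff:.2f}")
print(f"effective sample size = {n * c / deff:.0f} of {n * c}")
print(f"variance by components = {by_components:.4f}")
print(f"variance by correlation = {by_correlation:.4f}")
```

Even an intraclass correlation of roughly 0.006 yields a design effect close to 4 when 500 individuals are studied per community, so the 2500 individuals in each group carry about as much information as 630 or so independently allocated individuals.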


Specific Methods for Estimating Sample Size and Power 


Statistical tools useful for study planning have been developed for a variety of 
study designs that involve allocation by community. Donner and colleagues 
(22, 23) and Hsieh (43) provide guidelines for studies of simple, two-group 
comparisons that involve continuous or dichotomous outcome measures. 
Shipley et al (81) describe and illustrate methods for designs involving 
randomization of matched pairs of communities. Hsieh (43) discusses an 
approach for power calculations when communities are to be randomized 
within two or more strata, and when treatment effects are to be measured in 
terms of a pretest/posttest comparison over time. Koepsell et al (48) suggest 
an approach that can be used when the time path of a program effect is of 
central interest, as may be true for an evaluation developed around a specific 
intervention model. They also discuss different approaches for longitudinal 
versus repeated cross-sectional samples of individuals studied over time. 
Earlier work by Gillum et al (35) also considers the problem of allowing for 
dropouts over time. 


Obtaining Estimates of Community-Level Variance 


One of the greatest challenges in estimating power and sample-size 
requirements in community-based studies is providing estimates of the 
community-level variance component, $\sigma_C^2$. [For a design involving comparison of 
changes over time, the evaluator would instead supply an estimate of $\sigma_{CT}^2$, 
the community-by-time interaction variance, against which treatment-by-time 
interactions would be tested. See Koepsell et al (48).] Depending on the 
outcome variable of interest, suitable estimates can sometimes be derived 
from public data sources, such as the Centers for Disease Control Behavioral 
Risk Factor Survey, or from previous studies. 

Several statistical methods for estimation of variance components have 
been proposed (77). Particularly when the number of individuals studied 
varies across communities, these methods can yield different estimates. As a 
practical matter, both the BMDP and SAS computer packages have 
procedures for computing variance components. In BMDP, procedure P3V 
provides both maximum likelihood and restricted maximum likelihood 
(REML) estimation methods. In SAS, the corresponding procedure is PROC 
VARCOMP. 

For illustration, Table 1 presents estimates of $\sigma_C^2$ and of $\sigma^2$ for current 
smoking status, as derived from three studies that involved data collection 
across several communities: the evaluation of the Kaiser Family Foundation 
Community Health Promotion Grants Program (89), the RAND Health 
Insurance Experiment (64), and a survey of cancer-related risk behaviors 
conducted in Washington State by the Cancer Prevention Research Program at the 
Fred Hutchinson Cancer Research Center (1990, unpublished data). In each 
of those studies, cities or counties were the communities of interest, and 
smoking status was coded as 0=nonsmoker, 100=smoker to yield estimates 
in a convenient numerical range. Within each study, estimates of $\sigma_C^2$ 
obtained by the three statistical methods are generally similar, e.g. they range 
from 8.8 to 10.7 for the Kaiser data. However, the point estimates are quite 
different across studies; REML estimates range from 5.4 for the RAND data 
to 30.3 for the Washington State data. Despite the relatively large number of 
individuals studied, estimates of $\sigma_C^2$ from these data sets are based on small 
numbers of communities and thus have rather wide confidence limits. When 
such data are used in study planning, it may be wise to use several estimates 
of $\sigma_C^2$, which vary through a plausible range and yield “optimistic” and 
“pessimistic” estimates of sample size or statistical power. 
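
A minimal sketch of such a sensitivity analysis follows. It uses a normal approximation to the power of a two-group comparison of community means, which is an assumption made here for brevity and not one of the methods cited above. Because it ignores the small number of degrees of freedom, it overstates power when few communities are studied. The three values of the community-level variance are the REML point estimates quoted in the text. The effect size, n, and c are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(delta, sigma2_c, sigma2, n, c, alpha=0.05):
    """Normal-approximation power for a two-group comparison of community means.

    Ignores the small number of degrees of freedom, so it is optimistic
    when c is small. The negligible opposite tail is also ignored.
    """
    se_difference = sqrt(2.0 * (sigma2_c + sigma2 / n) / c)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(delta) / se_difference - z_crit)


# REML point estimates of the community-level variance from the three studies.
scenarios = {"optimistic": 5.4, "middle": 10.7, "pessimistic": 30.3}
SIGMA2 = 1800.7             # individual-level variance (Kaiser data, Table 1)
DELTA, N, C = 5.0, 500, 10  # hypothetical effect (percentage points) and design

power = {
    label: approximate_power(DELTA, s2c, SIGMA2, N, C)
    for label, s2c in scenarios.items()
}
for label, value in power.items():
    print(f"{label:12s} sigma_C^2 = {scenarios[label]:5.1f}  power = {value:.2f}")
```

Under these assumed inputs, the approximate power to detect a five-point difference in smoking prevalence drops from roughly 0.96 in the optimistic scenario to just under 0.5 in the pessimistic one, which illustrates how strongly the planning conclusion can depend on the assumed community-level variance.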

Particularly for large data sets, the task of computing variance component 
estimates can be time-consuming and costly, and an investigator may lack the 
resources to do so. Sometimes, only published, community-level means or 
prevalences may be available. In these situations, an investigator can obtain a 
crude point estimate of $\sigma_C^2$ by simply computing the variance of the set of 
community-level means or prevalences. On average, such an estimate tends to 
be conservative (i.e. too large) and probably still has wide confidence limits if 
based on a small number of communities. But, at least the estimate gives an 
investigator an idea of $\sigma_C^2$ for use in study planning. 


Table 1 Examples of individual- and community-level variance components for current smoking status 

                                 Kaiser Community     RAND Health            Washington State 
                                 Health Promotion     Insurance Experiment   Cancer-Related Behavioral 
                                 Grants Program       (at entry)             Risk Factor Survey 

No. communities                  15                   6                      35 
Total no. individuals            8726                 5094                   1642 
Prevalence of smoking            24%                  37%                    26% 
Individual-level variance ($\sigma^2$) 
  Point estimate                 1800.7               2342.3                 1990.7 
  95% conf. limits               (1746.2, 1855.1)     (2249.5, 2435.3)       (1850.9, 2130.7) 
Community-level variance ($\sigma_C^2$) 
  REML method 
    Point estimate               10.7                 5.4                    30.3 
    95% conf. limits             (0, 21.7)            [illegible]            (0, 79.9) 
  ML method 
    Point estimate               8.8                  [illegible]            25.1 
    95% conf. limits             (0, 17.6)            [illegible]            (0, 67.2) 
  Method of moments 
    Point estimate               10.0                 [illegible]            20.1 
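
The crude estimate described above, and the reason it tends to be too large, can be sketched as follows. For a balanced design with n individuals per community, the variance of the community means has expected value $\sigma_C^2 + \sigma^2/n$, so it exceeds $\sigma_C^2$ by the sampling noise in each mean. The adjusted estimate below, which subtracts that term, is an addition made here for illustration and is not proposed in the text. The prevalences, the sample size, and the function names are hypothetical.

```python
from statistics import mean, variance


def crude_community_variance(community_means):
    """Sample variance of published community-level means or prevalences."""
    return variance(community_means)


def adjusted_community_variance(community_means, sigma2, n):
    """Crude estimate minus the expected sampling noise sigma2 / n, floored at 0."""
    return max(0.0, variance(community_means) - sigma2 / n)


# Hypothetical smoking prevalences (percent) from six communities,
# each assumed to be based on n = 400 respondents.
prevalences = [22.0, 25.0, 27.0, 24.0, 30.0, 21.0]
n = 400

p_bar = mean(prevalences)
sigma2 = p_bar * (100.0 - p_bar)  # binomial variance on the 0/100 coding

crude = crude_community_variance(prevalences)
adjusted = adjusted_community_variance(prevalences, sigma2, n)
print(f"crude = {crude:.2f}, adjusted = {adjusted:.2f}")
```

With these made-up figures the crude estimate is about 11.0, and removing the expected sampling noise brings it down to about 6.3. Using the crude value in study planning would therefore err on the side of a larger study.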


Analysis Strategies 


Oft-quoted advice by Cornfield (17) is: “Randomization by cluster 
accompanied by an analysis appropriate to randomization by individual is an 
exercise in self-deception . . . and should be discouraged.” Whiting-O’Keefe & 
Simborg (91) have also commented on the all-too-common practice of ignor- 
ing the proper unit of analysis in studies that involve assignment of aggregates 
to treatment conditions. 

Space limitations permit only brief mention of several suitable analysis 
techniques. Randomization tests (65) provide a valid method to test for 
program effects with minimal statistical assumptions. These tests are more 
feasible to implement with small numbers of study units, especially in the 
present era of cheap computing power. However, they have the decided 
disadvantage of never being able to reject the null hypothesis if the number of 
possible assortments of study communities into treatment groups is very 
small. Traditional analysis-of-variance methods for hierarchical (nested) study 
designs may be used and are readily implemented when the number of 
observations per community is relatively constant across communities (25). 
Analysis using community means as though they were elementary observa- 
tions can also be a straightforward and valid approach for such “balanced” 
designs. The above-mentioned BMDP procedure P3V can accommodate 
designs with unequal sample sizes (20). The analysis of variance is most 
applicable for continuous outcome measures, but it may also be suitable for 
dichotomous outcomes if the number of observations per community is 
reasonably large and if community prevalences are not too close to 0 or 1. 
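
The randomization test mentioned above, and its limitation with very few communities, can be illustrated with a short sketch that treats community means as the elementary observations. The data and function names are hypothetical, and the test statistic (the absolute difference in group means) is one reasonable choice among several.

```python
from itertools import combinations
from math import comb


def randomization_test(means, intervention):
    """Two-sided randomization test on community-level means.

    means:        outcome mean for each community
    intervention: indices of the communities that received the program
    Returns the share of possible assignments whose absolute difference in
    group means is at least as large as the one observed.
    """
    k, total = len(intervention), sum(means)

    def abs_difference(indices):
        s = sum(means[i] for i in indices)
        return abs(s / k - (total - s) / (len(means) - k))

    observed = abs_difference(intervention)
    assignments = list(combinations(range(len(means)), k))
    as_extreme = sum(abs_difference(a) >= observed - 1e-12 for a in assignments)
    return as_extreme / len(assignments)


def smallest_possible_p(n_communities, n_intervention):
    """No p-value can be smaller than 1 / (number of possible assignments)."""
    return 1.0 / comb(n_communities, n_intervention)


# Hypothetical smoking prevalences (percent); intervention sites listed first.
p_small = randomization_test([20.0, 22.0, 27.0, 29.0], intervention=(0, 1))
p_large = randomization_test(
    [18.0, 19.0, 20.0, 21.0, 22.0, 26.0, 27.0, 28.0, 29.0, 30.0],
    intervention=(0, 1, 2, 3, 4),
)
print(f"2 vs 2 communities: p = {p_small:.3f}")
print(f"5 vs 5 communities: p = {p_large:.4f}")
```

With two communities per group there are only six possible assignments, so even complete separation of the groups gives p = 2/6 and the null hypothesis cannot be rejected at the 0.05 level. With five per group there are 252 assignments, and the same pattern gives p = 2/252.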

Donald & Donner (21) have suggested a method that accounts for 
randomization by cluster when combining 2 × 2 contingency tables across 
communities. Donner & Donald (24) have proposed analytic methods when 
randomization by cluster has been carried out within strata. Zeger et al (95) 
have described powerful and flexible analysis methods for correlated di- 
chotomous outcomes by using generalized estimating equations in the context 
of longitudinal studies. Software to implement their approach is not yet 
widely available, however. 


ALLOCATION OF A SMALL NUMBER OF 
COMMUNITIES 


Often, funding agencies or communities themselves decide whether a 
program is to be mounted in a particular community or set of communities, and 
an evaluator may have little say in the matter. On other occasions, a 
multicommunity program may be set up as a planned social experiment, thus 
allowing evaluation considerations to affect the process by which 
communities are designated as intervention or nonintervention sites. But, even when an 
evaluator has the luxury of allocating communities to treatment groups, it may 
be far from clear how best to do so. Here, we consider two aspects of the 
decision. 


Should Communities Be Randomized? 


When only a few study communities are available to be allocated randomly to 
an intervention and a control group, there is an increased risk of a major 
imbalance between groups on important confounding factors, whether these 
factors are known or unknown. One can argue that some possible outcomes of 
simple randomization would be unacceptable, such as those that put interven- 
tion and control communities into the same media market and lead to cross- 
contamination. For that reason, for example, and to minimize investigator 
travel, communities close to study headquarters are sometimes chosen as 
intervention sites, thus leaving communities farther away as controls. 

Nonetheless, even when only a few communities are available for study, a 
random allocation process has much to recommend it (51), especially when 
processes other than simple random allocation are considered. The difficulty 
of creating acceptably balanced treatment groups results chiefly from the 
limited number of communities available for assignment, and that difficulty 
remains whether randomization is used or not. Other methods for achieving 
balance, such as matching or stratification, can be used in conjunction with 
randomization. In the COMMIT project, for example, 11 pairs of communi- 
ties were formed, and one member of each pair was chosen at random to be 
the intervention site (34); in the Kaiser Health Promotion Evaluation Project, 
a form of restricted randomization was used after study communities were 
arranged into strata (89). Restricted randomization can also be used to deal 
with the problem of shared media markets by ruling out certain unacceptable 
study group configurations in advance and selecting one of the remaining 
acceptable configurations at random, as long as each community ultimately 
has an equal chance of becoming an intervention or a control site. (This may 
be a particularly suitable context in which to use a randomization test for 
statistical inference.) In brief, although a carefully designed random 
allocation process may not prevent problems of treatment group comparability as 
neatly as it does with larger samples, it need not complicate them either. And, 
randomization offers other advantages: namely, a firm basis for formal 
hypothesis testing and a public perception of even-handedness in forming the 
comparison groups that is hard to achieve any other way. 


Should Communities Be Matched? 


As noted above, matching can be used with or without randomization to 
achieve some degree of comparability between intervention and control 
groups or to enhance power. Theoretically, the best factor on which to match 
is one that is highly correlated with change in the outcome variable; in 
practice, there may be limited knowledge about which community character- 
istics qualify as good matching factors on this basis. Freedman et al (34) 
showed that a matching scheme that incorporated geographic proximity and 
community size appeared to perform well in forming matched pairs that were 
similar with regard to the prevalence of the target behavior at baseline. 
However, Martin et al (59) suggest that when the number of study communi- 
ties is small, matching should be used only in the presence of a very good 
matching factor, chiefly because the loss of degrees of freedom that results 
from using the community pair (rather than the individual community) as the 
unit of analysis can seriously compromise power and, in fact, weaken the 
comparison. 
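
The two allocation schemes discussed in this section, randomization within matched pairs and restricted randomization, can be sketched briefly. The community names, the media-market pairs, and the function names are hypothetical. The restriction is read here as "communities that share a media market must end up in the same arm", which is one plausible interpretation of an unacceptable configuration and not a description of the procedure used in any of the cited projects.

```python
import random
from itertools import combinations


def randomize_within_pairs(pairs, rng):
    """Pick one member of each matched pair at random as the intervention site."""
    return [rng.choice(pair) for pair in pairs]


def acceptable_allocations(communities, n_intervention, same_market_pairs):
    """All allocations that never split a shared media market across arms."""
    acceptable = []
    for chosen in combinations(communities, n_intervention):
        chosen = set(chosen)
        if all((a in chosen) == (b in chosen) for a, b in same_market_pairs):
            acceptable.append(chosen)
    return acceptable


def intervention_probabilities(communities, allocations):
    """Chance that each community becomes an intervention site."""
    return {
        name: sum(name in a for a in allocations) / len(allocations)
        for name in communities
    }


rng = random.Random(1992)  # fixed seed so the sketch is reproducible

# Matched pairs (hypothetical names): one site per pair gets the program.
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]
paired_sites = randomize_within_pairs(pairs, rng)

# Restricted randomization: P and Q share a media market, as do R and S.
communities = ["P", "Q", "R", "S", "T", "U"]
allocations = acceptable_allocations(communities, 3, [("P", "Q"), ("R", "S")])
probabilities = intervention_probabilities(communities, allocations)
chosen_sites = rng.choice(allocations)

print("pair-randomized intervention sites:", paired_sites)
print("acceptable allocations:", [sorted(a) for a in allocations])
print("probability of intervention:", probabilities)
print("selected intervention sites:", sorted(chosen_sites))
```

In this example only 4 of the 20 possible allocations are acceptable, yet every community still has a one-in-two chance of becoming an intervention site. Checking the assignment probabilities in this way before drawing the allocation is a simple guard for the equal-chance condition stated above.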


LONGITUDINAL VERSUS REPEATED 
CROSS-SECTIONAL SAMPLES 


A central goal of most community-based health promotion programs is to 
reduce risky health behaviors in study communities. Surveys of community 
residents at two or more points in time are often required to obtain direct 
evidence on whether this goal is met. These surveys may use either longitu- 
dinal samples, which consist of a panel of individuals in each community who 
are surveyed repeatedly, or repeated cross-sectional samples, which consist of 
a fresh sample of individuals from each community on each survey occasion 
(usually with only a small probability of repeated selection of the same 
individual). Although this discussion is in terms of samples of individuals, 
similar comments apply to other possible subunits within a community, such 
as restaurants or schools. 

Several writers (1, 29, 36, 73) have commented on the relative merits of the 
longitudinal and repeated cross-sectional sampling approaches. The choice 
between the two depends on the correspondence between sample type and 
program objectives, on relative susceptibility to biases, on statistical power 
trade-offs, and on cost. Table 2 summarizes factors to be considered in this 
section. 


Table 2  Factors influencing a choice between longitudinal and repeated cross-sectional samples

Factor                         Longitudinal                          Cross-sectional
-----------------------------  ------------------------------------  ------------------------------------
Program objective              Directly measures change in           Directly measures change in
                               individual health characteristics     community prevalence of health
                                                                     characteristics
Selection bias at recruitment  May be worse because participation    Participation may be anonymous
                               is not anonymous
Attrition                      Losses to follow-up may be related    Not a problem
                               to behavior being evaluated
Testing                        Repeated questioning may be a         Not a problem
                               co-intervention
Maturation                     Panel gets older, while community     Not a problem
                               at large may not
History                        Panel consists of more long-term      Less a problem
                               community residents with exposure
                               to “local history”
Cross-contamination            Not a problem                         Movement between intervention and
                                                                     control communities may dilute
                                                                     intervention effect
Statistical power              Higher for fixed sample size and      Lower
                               intervention effect


Correspondence with Program Objectives


An important question is whether the intervention seeks primarily to change
the health behavior of individuals, or to change the prevalence of risky
behaviors in the community. These two kinds of changes are not the same, as
communities are dynamic populations whose membership can change over
time because of births, deaths, and in- and out-migration. A decline in the 
community prevalence of a behavior over time may occur, even in the 
absence of any individual-level behavior change, if individuals who join or 
leave the community differ systematically from other community residents in 
terms of their health behavior. Community-based programs usually do seek
individual-level behavior change. Sometimes, however, they may also change
the social environment, deliberately or otherwise, through recruitment of
persons with healthy behavior and out-migration of those with risky behavior.
For example, a worksite health promotion program may succeed in
institutionalizing a preference for nonsmokers in hiring decisions, and its
workplace smoking policies may make smokers uncomfortable enough that they
seek jobs elsewhere. Other
factors being equal, longitudinal samples are theoretically better suited to 
isolating program effects on individual behavior change, whereas repeated 
cross-sectional samples are better suited to measuring program effects on 
community-wide prevalence. 
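
The distinction between the two kinds of change can be made concrete with a small arithmetic sketch. All numbers below are hypothetical and chosen only for illustration; the sketch assumes that no resident who stays in the community changes smoking status, so that any change in prevalence is due entirely to turnover.

```python
# Hypothetical numbers, chosen only to illustrate the argument in the text.

def prevalence(smokers: int, residents: int) -> float:
    """Proportion of residents who smoke."""
    return smokers / residents

# Baseline community: 1000 residents, 300 of whom smoke.
baseline_residents, baseline_smokers = 1000, 300

# Turnover between surveys: 100 residents leave (50 smokers among them)
# and 100 newcomers arrive (10 smokers among them).
leavers, leaver_smokers = 100, 50
arrivals, arrival_smokers = 100, 10

# Assumption: no resident who stays changes smoking status.
stayers = baseline_residents - leavers
stayer_smokers_baseline = baseline_smokers - leaver_smokers
stayer_smokers_followup = stayer_smokers_baseline

followup_residents = stayers + arrivals
followup_smokers = stayer_smokers_followup + arrival_smokers

# Quantity targeted by repeated cross-sectional samples.
community_change = (prevalence(followup_smokers, followup_residents)
                    - prevalence(baseline_smokers, baseline_residents))

# Quantity targeted by a longitudinal panel of residents who stay.
panel_change = (prevalence(stayer_smokers_followup, stayers)
                - prevalence(stayer_smokers_baseline, stayers))

print(f"Change in community prevalence: {community_change:+.3f}")
print(f"Change within panel of stayers: {panel_change:+.3f}")
```

Under these assumed figures, community prevalence falls by four percentage points even though the panel shows no individual-level change, which is the situation the paragraph above describes.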

If the survey involves a large fraction of the community, and if population 
turnover is low, one may, in fact, generate a longitudinal subsample within 
the cross-sectional samples by repeated selection of the same individuals. In
other situations, there may be ways to circumvent limitations of a specific 
sampling approach by altering other aspects of the survey methodology. For 
example, respondents in a follow-up cross-sectional survey can be asked 
about their length of residence in the community and about any changes in 
their health behavior that occurred during the study period. It may also be 
possible to supplement a longitudinal sample or to replace those lost to 
follow-up with newcomers during the study to render its composition more 
representative of the community at each time point, even though this option 
complicates data analysis. 


Susceptibility to Bias 


Table 2 also highlights certain sources of bias that can affect longitudinal and
repeated cross-sectional samples differently. Here, “bias” means any systematic
difference between measured characteristics of the sample and the
corresponding true characteristics of the population supposedly represented by the
sample.

Self-selection at recruitment can occur under either sampling approach 
because of nonresponse. Active refusal to participate is an important com- 
ponent of nonresponse (31, 42, 88), and concerns about privacy account for 
many refusals in some surveys (19). Although respondents can participate 
anonymously as part of a single cross-sectional sample, members of a longitu- 
dinal sample must reveal their identities and consent to be recontacted. These 
additional demands may further jeopardize willingness to participate. 

Attrition affects longitudinal, but not repeated cross-sectional, samples.
Attrition can be large: In the Stanford Five-City Project, only 39% of the baseline
cohort completed three follow-up surveys over a five-year period (28).
Several longitudinal studies have found that individuals who smoke at baseline are
more likely to drop out than those who do not (41, 45, 46). Other studies have 
found that subjects who are harder to follow are more likely to have worse 
exercise habits (54) or higher levels of substance abuse at follow-up (41, 53, 
69). These findings suggest that losses from a cohort often occur preferential- 
ly among those with worse health habits. 

Testing effects occur when changes in reported behavior are caused (or 
inhibited) by the act of repeated questioning. They affect longitudinal samples 
only. Although the possibility of such effects has long been known by 
psychologists (8, 16) and shown for nonhealth behaviors, such as voting (49), 
little evidence is available concerning testing effects on reported health 
characteristics. In the MRFIT study (63), a larger discrepancy between
self-reported and thiocyanate-adjusted quit rates for smoking in the
intervention group compared with the control group at follow-up suggested the
possibility of testing-treatment interaction. A study by Bridge et al (7)
suggests that repeated questioning resulted in shifting attitudes about cancer.
Murray et al (62) found greater declines in smoking in a repeatedly questioned 
cohort of adolescents compared with a single, comparably aged cross- 
sectional sample. They inferred that the surveys themselves may have 
accounted for part of the difference. 

Maturation occurs in a longitudinal sample, which ages over time, whereas 
the age distribution in the community and in repeated cross-sectional samples 
may change very little. Any age-related phenomenon may thus appear to 
change in a longitudinal sample over time, even if the change had no relation 
to a community intervention. 

History may also preferentially affect longitudinal samples, which neces- 
sarily consist of longer-term community residents. Stable members of the 
community may have more exposure to local, nonprogram-related events that 
cause behavior change. 

Cross-contamination of treatment groups is at least a theoretical possibility, 
if mobility among study communities is high. With repeated cross-sectional 
samples, a follow-up survey participant may have recently moved from a 
control community to a study community, or vice versa, thus rendering the 
subject’s exposure status unclear. This kind of bias can be more of a concern 
if “community” is broadly defined to include such settings as workplaces or 
schools. 

Although these sources of bias can interfere with the degree to which the 
sample reflects the community at a given time point, they do not necessarily 
result in a biased estimate of program effect. If attrition affects longitudinal 
samples similarly in intervention and control communities, for example, this 
source of error would “cancel out” in a comparison between study groups. 
Likewise, bias that remains stable over time could still allow accurate estima- 
tion of a change in the prevalence of a characteristic over time. The strongest 
evaluation designs used to date have assessed program effect by comparing 
changes over time between intervention and control groups. Under such a 
design, the estimate of program effect would be biased only if there is 
interaction among size of bias, treatment group, and time, e.g. if repeated 
surveying renders a person more susceptible to an intervention effect, or if 
attrition of persons with unhealthy behavior occurs differently in the interven- 
tion group versus the control group. Unfortunately, little empirical evidence is 
available to judge how serious such potential threats to validity are in practice. 
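
The cancellation argument can be illustrated with a short sketch of the "change in intervention minus change in control" comparison. The prevalences and bias sizes below are hypothetical, and the additive bias model is an assumption made only to keep the illustration simple.

```python
# Hypothetical prevalences and biases; additive bias is an assumed model.
CELLS = ("int_base", "int_follow", "ctl_base", "ctl_follow")

def program_effect(p):
    """Change in intervention communities minus change in control communities."""
    return (p["int_follow"] - p["int_base"]) - (p["ctl_follow"] - p["ctl_base"])

def observe(true_prev, bias):
    """Add a measurement bias (default 0) to each true prevalence."""
    return {cell: true_prev[cell] + bias.get(cell, 0.0) for cell in CELLS}

# True prevalence of a risky behavior by group and survey occasion.
true_prev = {"int_base": 0.30, "int_follow": 0.24,
             "ctl_base": 0.30, "ctl_follow": 0.28}

scenarios = {
    # Same bias in all four cells, e.g. stable underreporting.
    "stable bias in every cell": dict.fromkeys(CELLS, -0.03),
    # Bias appears at follow-up but is equal in both groups.
    "follow-up bias, both groups": {"int_follow": -0.02, "ctl_follow": -0.02},
    # Bias depends on both group and time: the interaction case.
    "follow-up bias, intervention only": {"int_follow": -0.02},
}

true_effect = program_effect(true_prev)
effect_bias = {
    name: program_effect(observe(true_prev, bias)) - true_effect
    for name, bias in scenarios.items()
}

for name, value in effect_bias.items():
    print(f"{name}: bias in estimated program effect = {value:+.3f}")
```

In this sketch only the third scenario, in which the bias varies jointly with treatment group and time, distorts the estimated program effect; the first two leave it unchanged.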


Statistical Power 


A major attraction of the longitudinal-sample approach is its greater statistical 
power to detect change. This gain in power results from, and is quantitatively 
dependent on, intertemporal correlation in health characteristics at the in- 
dividual level: The more stable the characteristic, the greater the advantage of 
a longitudinal sample for detecting a hypothesized change of a given size.
Schlesselman (76) and Cook & Ware (15) discuss the statistical principles that 
underlie this conclusion. Koepsell et al (48) discuss performance of sample- 
size calculations for both sampling approaches. 
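
As a rough illustration of how intertemporal correlation translates into sample size, the sketch below uses a textbook normal-approximation formula for detecting a change in a mean between two occasions in a single population. It is not the method of the references cited above: it assumes simple random sampling of individuals, equal variance at both occasions, and no community-level clustering, and the function name and all input values are hypothetical. Setting the correlation to zero corresponds to independent cross-sectional samples.

```python
from math import ceil
from statistics import NormalDist

def n_per_occasion(delta, s2, r=0.0, alpha=0.05, power=0.80):
    """Approximate number of individuals per survey occasion needed to
    detect a mean change `delta` (two-sided test, normal approximation).

    s2 is the variance of the characteristic and r the correlation between
    an individual's baseline and follow-up values; r = 0 represents
    independent (repeated cross-sectional) samples.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Assumed variance of the estimated change: 2 * s2 * (1 - r) / n.
    return ceil(2 * s2 * (1 - r) * z ** 2 / delta ** 2)

# Hypothetical inputs: a binary behavior with prevalence 0.30
# (variance 0.30 * 0.70) and a hypothesized 5-point change.
s2 = 0.30 * 0.70
n_cross_sectional = n_per_occasion(delta=0.05, s2=s2, r=0.0)
n_longitudinal = n_per_occasion(delta=0.05, s2=s2, r=0.6)

print("Cross-sectional, per occasion:", n_cross_sectional)
print("Longitudinal panel (r = 0.6): ", n_longitudinal)
```

Under these simplifying assumptions the required panel size shrinks roughly in proportion to 1 - r, which is why stable characteristics favor the longitudinal approach on power grounds alone.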

Given that the longitudinal-sample approach may be more susceptible to a 
variety of biases, as discussed above, Martin et al (58) derived a simple 
inequality that shows how large the added bias must be to outweigh the power 
advantages of a longitudinal sample, at least for a simple design situation. 
Specifically, consider a design in which $r$ = the correlation between an
individual’s baseline and follow-up health behavior status, $n$ = the number of
individuals surveyed per occasion, $b_l$ = the amount of bias in the estimate of
mean change from baseline to follow-up based on a longitudinal sample, $b_x$
= the corresponding bias for a cross-sectional sample, and $s^2$ = the overall
variance in behavior. Martin et al showed that when $r < n(b_l^2 - b_x^2)/(2s^2)$, then a
cross-sectional sample approach yields a lower expected mean-squared error
than a longitudinal-sample approach.
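
The inequality can be checked numerically. The sketch below writes r for the intertemporal correlation, n for the number surveyed per occasion, b_l and b_x for the longitudinal and cross-sectional biases, and s2 for the variance in behavior. It assumes, as a simplification not spelled out in the text, that the variance of the estimated mean change is 2·s2·(1 − r)/n for a longitudinal sample and 2·s2/n for a cross-sectional one; with those forms, comparing mean-squared errors reproduces the stated condition. The function names and input values are hypothetical.

```python
def mse_longitudinal(r, n, s2, b_l):
    """Assumed MSE of estimated mean change from a panel: variance + bias^2."""
    return 2 * s2 * (1 - r) / n + b_l ** 2

def mse_cross_sectional(n, s2, b_x):
    """Assumed MSE of estimated mean change from independent samples."""
    return 2 * s2 / n + b_x ** 2

def cross_sectional_preferred(r, n, s2, b_l, b_x):
    """Condition from the text: r < n (b_l^2 - b_x^2) / (2 s^2)."""
    return r < n * (b_l ** 2 - b_x ** 2) / (2 * s2)

# Hypothetical design: 500 respondents per occasion, variance 0.21,
# correlation 0.5, with two alternative assumptions about panel bias.
design = dict(r=0.5, n=500, s2=0.21)
for b_l, b_x in [(0.03, 0.01), (0.01, 0.01)]:
    rule = cross_sectional_preferred(b_l=b_l, b_x=b_x, **design)
    direct = (mse_cross_sectional(design["n"], design["s2"], b_x)
              < mse_longitudinal(design["r"], design["n"], design["s2"], b_l))
    print(f"b_l={b_l}, b_x={b_x}: inequality says cross-sectional "
          f"preferred = {rule}; direct MSE comparison = {direct}")
```

In both hypothetical cases the inequality and the direct comparison of mean-squared errors agree: extra bias in the panel can outweigh its variance advantage, whereas equal biases leave the panel ahead whenever r is positive.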

Unfortunately, a confident choice between sampling approaches depends 
on having good advance estimates of the likely extent of several kinds of bias 
and of the expected intertemporal correlation in the characteristics being 
measured. Moreover, all of these factors can be expected to vary from one 
behavior to another, so that the superior sampling approach for studying one 
behavior may be inferior for studying another. Perhaps for these reasons, 
several evaluations of large-scale community interventions have used both 
longitudinal and cross-sectional samples; these evaluations usually let the 
baseline survey sample serve both as the longitudinal sample and as the first 
cross-sectional sample (29, 44, 89). Building on this practical stratagem of 
safety through redundancy, Thornquist & Anderson (86) have recently pro- 
posed what they nickname a “belt and suspenders” method for combined 
analysis of data from longitudinal and cross-sectional samples, which uses 
generalized estimating equations. 


VALIDITY OF SELF-REPORTED HEALTH 
CHARACTERISTICS 


In community-based health promotion and disease prevention studies, in- 
formation about health behavior is often gathered directly from individuals 
through interviews or self-administered questionnaires. There is a widespread 
belief that people are inclined to overreport desirable health behaviors and 
underreport undesirable health behaviors. As more attention is paid to health 
behaviors in the media, in public places, in worksites, and in clinical practice, 
individuals, families, and different social groups may become sensitized to 
socially desirable forms of behavior. Therefore, methodologies to investigate 
and improve the validity of self-reports are important to develop and apply.

One major approach is to search for “objective” measures of behavior, on
the assumption that they are free of subjective bias. Biochemical validation 
tests, such as those used in smoking research, are prized for their criterion 
validity. These “gold standard” measures, however, may be too costly, as 
well as vulnerable to between-individual variation in absorption, metabolism, 
and excretion (37). One investigative team even concludes that “. . . question- 
naire response appears to be the standard against which physiologic test of 
smoking must be judged, not vice versa” (68). Self-reports often become the 
only feasible method for collecting data on health behaviors. We summarize 
here published evidence for the validity of self-reports for two forms of health 
behavior that have been common targets of community-based interventions: 
cigarette smoking and dietary behavior. We also discuss the major methodol- 
ogies for evaluating and improving these reports. 


Cigarette Smoking 


A recent review and meta-analysis of studies that used biochemical validation
of smoking behavior suggests that self-reports of cigarette smoking
obtained by in-person interviews have fairly high sensitivity and specificity
among adult respondents who participate in community studies, when
examined in relation to a biochemical measure of smoking status (66). Similar
validation studies carried out among students suggest that
self-reports among adolescents involved in smoking cessation interventions
are less accurate. Biochemical validation remains desirable in evaluations of
smoking cessation interventions.
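
For readers less familiar with these terms, the sketch below shows how sensitivity and specificity of self-report would be computed when a biochemical measure is treated as the reference standard. The counts are invented for illustration and are not taken from the review cited above.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity and specificity of self-report against a reference test.

    true_pos:  biochemically classified smokers who report smoking
    false_neg: biochemically classified smokers who deny smoking
    true_neg:  biochemically classified nonsmokers who report not smoking
    false_pos: biochemically classified nonsmokers who report smoking
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical validation sample of 1000 adults.
sens, spec = sensitivity_specificity(true_pos=180, false_neg=20,
                                     true_neg=784, false_pos=16)
print(f"Sensitivity of self-report: {sens:.2f}")
print(f"Specificity of self-report: {spec:.2f}")
```

In a cessation study the false negatives are the critical cell, because they are smokers who would be counted as quitters if self-report alone were used.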

Biochemical validation cannot determine, however, the accuracy of reports 
regarding smoking consumption, i.e. the number of cigarettes smoked (85). 
Nor can biochemical tests be used to validate smoking histories that yield 
estimates of risk in terms of pack-years. Lifetime smoking consumption is 
likely underreported, given the difficulties of long-term recall. 

Several methodological techniques have been used to evaluate and improve
self-reports of smoking behavior. Studies of surrogate reports of behavior,
usually from next of kin and particularly spouses, indicate that self-reports of
cigarette smoking correlate highly with surrogate reports (60).

Other studies have suggested that informing subjects that a biochemical 
measure of cigarette smoking, such as salivary cotinine or expired carbon 
monoxide, is to be obtained improves the validity of self-reports (4, 26). In 
some instances, bogus measurement procedures are used, or biochemical 
samples are obtained but never analyzed. This approach has been called the 
“bogus pipeline.” When genuine objective measures were used in research 
with adolescents, Bauman & Dent (4) found that adolescents who had recent- 
ly smoked reported significantly greater amounts of smoking if they were 
informed about the biochemical measure before completing the questionnaire.

Unfortunately, published studies evaluating self-reports of cigarette
smoking seldom contain the actual questions used to classify smokers and
nonsmokers. Thus, the form and content of the questions themselves are 
difficult to evaluate. The actual wording of questions can influence the 
responses given and, hence, the categorization of respondents as smokers 
(87). Therefore, studies asking about smoking should report or reference the 
questions used, so that this potential source of invalidity can be examined and 
controlled. They should also report whether subjects were told before answer- 
ing questions that they would later be asked to provide a specimen for 
biochemical validation. 


Dietary Behavior 


A problem with assessing dietary behavior through self-report is that eating is 
a mundane, frequent behavior that a person does with relatively little atten- 
tion. At least three methods have been used in community-based studies to 
assess dietary change: nutrient intake (diet records, 24-hour recall, and food- 
frequency questionnaires); biochemical measures (primarily serum choles- 
terol); and approaches aimed at the specific targets of the intervention (mea- 
sures of individual behavior, such as “Yesterday, did you eat a vegetable with 
dinner?”, or environmental measures as discussed below, such as percent of 
supermarket milk shelf space devoted to lowfat milk) (50). 

The lack of a criterion measure of dietary intake in free-living persons is the 
major problem in evaluating the validity of these measures. Assessing con- 
vergent validity (concurrence among different measures) is a common alterna- 
tive. In general, correlations among various nutrient intake measures are 
rarely above 0.6 and, depending upon the nutrient, are frequently as low as 
0.3 (55). Even for food frequency questionnaires, which are designed to 
minimize intra-individual variability, test-retest correlations are rarely above 
0.65 and may be as low as 0.2 (84). 
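
The correlations quoted above are ordinary product-moment correlations between two sets of measurements on the same persons. A minimal sketch of a test-retest calculation follows; the data are invented (weekly vegetable servings reported by eight hypothetical respondents on two administrations of the same questionnaire) and the function name is illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical reports of weekly vegetable servings, same eight respondents.
first_administration = [14, 7, 21, 3, 10, 5, 12, 9]
second_administration = [10, 9, 15, 6, 14, 4, 8, 12]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {test_retest_r:.2f}")
```

The same function applies to convergent validity if the second list holds values from a different instrument, such as a diet record, rather than a repeat administration.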

A special threat to validity arises from the nonblinded nature of most 
community dietary intervention studies. If the intervention program has an 
effective public education component, residents of intervention communities 
understand the relationship between food and health better and pay greater 
attention to food and food choices. These intervention effects could influence 
measurements in the absence of behavior change, thus confounding any 
interpretation of contrasts between intervention and control communities. The 
act of retesting a cohort may also produce biases in reported behaviors, as 
discussed earlier. Unfortunately, there are few data with which to substantiate 
or estimate the magnitude of these potential biases. 

In practice, 24-hour recalls and food records have usually been deemed too 
expensive, time-consuming, and difficult to administer for use in large-scale 
community studies. Food frequency questionnaires and abbreviated
questionnaires of behavior specifically targeted by the intervention, methods that rely
upon retrospective reports of dietary habits, are most often the only practical 
means for assessing dietary change. The cognitive processes that underlie 
responding to these methods are complex. For example, food frequency 
judgments require individuals to assign “typical frequency” and portion size 
judgments for what is often a long list of food items. An inferential process, 
by which a frequency judgment is derived at the time the question is asked, 
must occur. Little research has been done to investigate these cognitive 
processes and their potential for biasing reported dietary behaviors (82, 83). 

Epidemiologic studies of dietary behavior (93) have found that respon- 
dents’ reports of current dietary behaviors, or recall of previous behaviors, 
depend on whether the foods are perceived as socially desirable or personally 
relevant. Comparisons with daily food records have indicated overestimates 
of up to 50% on food frequency judgments for “healthy” foods and un- 
derestimates of up to 30% for “unhealthy” foods (74). 

Various approaches need to be investigated to both assess and minimize 
these biases in dietary recall. For example, social desirability and food 
salience scales may be included in evaluation schemes (18). Less direct 
approaches include making the dietary intake assessment an adjunct to some 
other task not so closely related to health habits (e.g. embedding it in a longer 
series of questions about consumer buying behavior). Another approach 
might be to include bogus foods (e.g. lowfat olive oil) in food frequency 
questionnaires to estimate the overreporting of “healthy” foods. 

Both laboratory-based and community-level studies are needed to advance 
our understanding of how individuals evaluate and report health behaviors, 
and whether any biases we find differ for persons in intervention
and control communities. Research accumulated over the last few decades
suggests that self-reports of smoking require biochemical validation in
intervention studies, particularly with adolescents in school-based cessation 
programs. The lack of such biochemical measures for self-reported dietary 
behavior adds considerable complexity to the assessment of an inherently 
complicated and multifaceted behavior. 


MEASURES OF COMMUNITY ENVIRONMENT 


Several complicating factors that arise in assessing the outcome of com- 
munity-level interventions enhance the attractiveness of a class of measures 
that we call “environmental” indicators. This section briefly describes the 
complicating factors, defines environmental indicators and places them in the 
context of other community-level measures, and gives some examples.
Cheadle et al (12) provide a more complete discussion of this class of measures.

As noted above, two difficulties arise in assessing community-level
interventions: the impossibility of blinding individual subjects to the presence of
the intervention, which threatens the validity of self-reported attitudes,
behaviors, and outcomes; and the complexity of the mechanism of action by
which community programs change individual behaviors, with many
intermediate steps in the behavior change process. As illustrated by the causal
model shown in Figure 1, these intermediate steps often involve modifying what
can be labeled the “community environment,” defined broadly to
include the legal, social, and economic, as well as the physical, environment.
Components of the health-related community environment include institutions
(stores, worksites, political institutions), geography (air, water quality),
media messages (TV, radio, print), and laws and regulations (smoking
ordinances).

Environmental indicators thus serve two functions in an evaluation. First, 
they provide an indicator of shared attitudes and/or collective behavior that 
does not rely on self-reports. Second, they capture features of the environ- 
mental link in the chain that connects health-promotion programs to changes 
in health-related behavior. 

Environmental indicators are derived from observations of the community 
context in which people live. To clarify this notion, it is useful to relate them 
to other “community-level measures,” i.e. approaches to characterizing the 
community as a whole as opposed to individuals or subgroups within it. 
Community-level measures can be divided into three subcategories:
“individual-disaggregated” measures, which rest on information originally
obtained on individuals for whom individual-level covariate data (e.g.
demographic characteristics) are available and can be considered in analyzing
and interpreting community-level summary statistics; “individual-aggregated”
measures, which are derived from individual-level information but available
only in aggregated form; and “environmental indicators,” which are measures
based on observations of the community environment.

Most community-level measures that have been used in evaluating health 
promotion programs fall into the first category, i.e. are based on individual- 
level measures (e.g. interview surveys, physiologic measures) through which 
additional information on each respondent is available. These data are most 
frequently gathered by investigators who are evaluating a particular health- 
promotion program, but could easily include other public-use data available at 
the individual level (birth and death tapes, hospital discharge abstracts). 
Community-level measures formed by aggregating individual-level data, de- 
void of identifiers, to the community level include data collected by agencies 
other than the program evaluators. Examples of aggregate measures include 
census data, mortality rates, traffic-accident statistics, and most economic 
data (e.g. sales information).

Environmental indicators, the third class of community-level measures, are
derived from observations of aspects of the community environment that, like 
other community-level measures, are then aggregated to the level of the 
community. For example, the number, type, and visibility of nonsmoking 
signs in a workplace (which can be regarded as a small community) are an 
environmental indicator of the attitudes of the workers and management in 
that workplace toward smoking. Greater degrees of militancy toward smoking 
among employees and management will probably be associated with more 
and better-advertised no-smoking areas. In addition, the number and character 
of workplace no-smoking signs are indicators of the environmental influences 
acting on employees. 

There are several strands in the existing literature relevant to environmental 
indicators. Since the mid-1960s, a substantial literature on social indicators 
and social indicator models has accumulated in sociology (9, 52, 72). For 
example, Carley (9) presents indicators derived from the Social and Economic 
Accounts System (SEAS) developed by Fitzsimmons & Lavey (32), which 
organizes 477 community indicators into 15 sectors. Health sector social 
indicators in the SEAS include individual-aggregated measures (e.g. number 
of deaths per 1000 live births), as well as measures that could be classified as 
environmental indicators: number of full-time equivalent physicians, hospit- 
als, and hospital beds. 

Other close relatives of environmental indicators in the existing literature
are the “unobtrusive” or “nonreactive” measures first collected and
categorized by Webb et al (90). A measure is unobtrusive if the object of interest is
unaware of being observed. Nonreactive measures do not suffer from the 
problem of reactivity bias, i.e. the “true” response is not altered by the 
process of measurement. All unobtrusive measures are nonreactive, but some 
nonreactive measures may be highly obtrusive (e.g. blood tests). In many 
cases, these unobtrusive measures would be classified as individual-level 
measures under our scheme, as the observations are made on individuals and 
then aggregated to get an estimated mean or proportion for the group of 
interest. However, several other measures reported in the literature are based 
on characteristics of the community environment [e.g. graffiti (78)] and can, 
therefore, be classified as environmental indicators. 

Table 3 provides examples of community-level measures related to tobacco 
use. These measures are categorized along two dimensions: the measurement 
category (individual-disaggregated, individual-aggregated, and
environmental) and the obtrusiveness and reactivity bias likely to be associated with the
measure. The environmental measures are further subdivided according to the 
component of the environment being measured (e.g. workplace, restaurant). 

Table 3  Examples of community-level measures of smoking-related attitudes and behavior

Measurement category (a)  Obtrusive, reactive               Obtrusive, nonreactive            Unobtrusive
------------------------  --------------------------------  --------------------------------  --------------------------------
Individual-disaggregated  Phone survey of smoking status,   Cotinine, other biochemical       Sample of household garbage to
                          attitudes toward smoking in       measures of smoking status        count cigarette butts
                          public places
Individual-aggregated     Published results of newspaper    Classroom-level cotinine          Cigarette sales, election
                          poll of attitudes toward smoking  measures                          results from a vote over a
                          in public places                                                    nonsmoking ordinance
Environmental
  Worksite                Interview with company CEO,       Survey of company smoking         Prevalence of designated
                          his/her views on smoking policy   policies                          nonsmoking areas
  Restaurants and other   Staff response to customers who   Proportion of restaurants with    Ashtray prevalence, visibility
  public spaces           smoke in nonsmoking areas         some nonsmoking seating           of nonsmoking signs
  Community as a whole    Interviews with key informants:   Interviews with key informants:   Does community have a
                          community attitudes               legislative history of            nonsmoking ordinance?
                                                            ordinances

(a) See text for discussion of measurement categories.

The examples in Table 3 may help clarify the earlier discussion of
terminology. The newspaper poll of attitudes could, in principle, be shifted to the
individual-disaggregated category, if the newspaper collected demographic
information on the respondents and made the individual-level information 
available to outside investigators. The worksite environmental indicators 
cover aspects of company smoking policy. The interview with the company 
president is likely to be colored by concern about public relations, and thus 
subject to a considerable amount of reactivity bias. The company will also be 
aware that a survey is being conducted of its smoking policy, but because the 
assessment could focus on written policy statements, there is less chance of an 
untruthful response. The observation of the prevalence of no-smoking areas 
could be made unobtrusively, if admittance were gained for some reason other 
than to conduct such a survey (e.g. the observations could be made by an 
employee). 

The advantages of environmental indicators have already been noted: They
are frequently unobtrusive and, therefore, not subject to response bias, and
they measure important intermediate factors in health-promotion
interventions. The drawbacks of environmental indicators are the same ones
that have held back the development of unobtrusive measures in social 
psychology: lack of persistent and credible efforts to assess and improve the 
validity and reliability of candidate measures (78). An effort to overcome this 
lack of evidence for environmental indicators has begun, however. For 
example, the reliability of a grocery store instrument designed as an environ- 
mental indicator of dietary habits has been assessed as part of the evaluation 
of the Kaiser Family Foundation Community Health Promotion Grants Pro- 
gram (11, 89). The validity of the grocery store instrument has also been 
assessed, by comparing the results of the survey with a phone survey of 
individuals in the same communities (10). Only through such a process of 
accumulating information about validity, reliability, and responsiveness to 
change can a fair test of these measures be conducted. 


CONCLUSIONS 


At present, the community-based approach to health promotion appears to be 
in an expansion phase, spurred in part by the apparent success of several 
large-scale, community-wide programs aimed at prevention of cardiovascular 
disease. New programs are now being developed for a wider array of health 
conditions, the definition of “community” is being broadened to include both 
larger and smaller social units, and the range of target populations is being 
widened demographically and socioeconomically. Many newer community- 
based programs are being mounted with fewer resources and a different mix 
of intervention modalities than their predecessors. All of these factors empha- 
size the importance of rigorous evaluation to determine when, where, how, 
and for whom the community-based approach succeeds. We hope that the 
above discussion helps sensitize evaluators to the special challenges they face
in attempting to answer those important questions and kindles the interest of
methodologists to develop new and better evaluative tools.


INTRODUCTION


In 1990, the Secretary of Health and Human Services unveiled Healthy 
People 2000: National Health Promotion and Disease Prevention Objectives 
(46), which is a milestone in public health. Healthy People 2000 identifies 
three national health goals: increase the span of healthy life, reduce health 
disparities among Americans, and achieve access to preventive services for all 
Americans. The report also details 300 specific objectives for health promo- 
tion and disease prevention programs with quantitative targets to be achieved 
by the year 2000. Meeting these objectives requires agreement by public 
health statisticians on measures of individual and community health status to 
guide public health policy development and priorities, especially for state and 
local areas, and improvement in the methods for tracking these measures. 

Healthy People 2000 challenges public health practitioners to develop 
surveillance systems that are both meaningful in a public health sense and 
statistically sound. To clarify this challenge and outline some possible 
responses, I begin with some background on the public health assessment 
activities on which the year 2000 health objectives build and considerations 
that should guide public health assessment efforts. The next section presents 
general statistical issues in formulating measurable and meaningful objec- 
tives. Other sections are devoted to two specific issues: the development of a 
small set of health status indicators that is both meaningful and feasible to 
monitor and special issues associated with setting objectives and determining 
appropriate targets for state and local areas. 


BACKGROUND 


“Public health assessment,” as used in the Institute of Medicine report The 
Future of Public Health, is the regular and systematic collection, assembly, 
analysis, and dissemination of information on the health of the community. 
This information includes statistics on health status, community health needs, 
and epidemiologic and other studies of health problems (15, p. 7). 

The 300 objectives in Healthy People 2000 indicate the information needed 
to guide public health policy at the state and local, as well as the national, 
levels. Healthy People 2000 explicitly calls for strengthened public health 
assessment efforts by devoting one of its 22 priority areas to specific objec- 
tives for surveillance and data systems. 

Healthy People 2000 also provides a means of communicating achievable 
health goals and the means for achieving them. Moreover, the report provides 
a means of measuring progress towards the national goals, of taking credit for 
battles won, and of assigning responsibility for further efforts (25). 

The “Year 2000 Health Objectives Planning Act” (PL 101-582) requires 
the Secretary of Health and Human Services to implement the surveillance 
objectives and funds states to develop plans to monitor and improve the health 
status of their populations. 

The information base for public health assessment is broad. The World 
Health Organization, for example, has published a report on the development 
of health indicators for its Health For All goal, which has guided national 
efforts in a number of countries (53). A decade ago, the US Public Health 
Service published national goals in the original Healthy People (44), as well 
as 226 specific health objectives for 1990 (48). In 1987, the National Com- 
mittee on Vital and Health Statistics reviewed the status of health promotion 
and disease prevention data at the state and local levels and made recom- 
mendations to improve the use of existing data, to develop and promote 
strategies for sharing expertise in the use of these data, and to develop 
alternative methodologies to meet state and local data requirements (30). Many state 
and local health officers use the Model Standards for Community Preventive 
Health Services, a collaborative effort of the Centers for Disease Control, the 
American Public Health Association, and several other public health 
professional associations, to guide local assessment efforts (1). The Public Health 
Foundation has developed core data sets for reporting on state public health 
activities related to the objectives (34), and the National Association of 
County Health Officials’ APEX program has developed methods for assessing 
public health needs and resources at the local level (29). 

Individual efforts to develop health status indicators must also be acknowl- 
edged (9, 19, 26). Among many such efforts, Murnaghan (28) has reviewed 
the status and priorities for health information systems needed for Health for 
All by the Year 2000 with a special focus on developing countries. 





PUBLIC HEALTH ASSESSMENT 61 


CONSIDERATIONS FOR PUBLIC HEALTH 
ASSESSMENT 


The twentieth century has seen a shift in the major causes of death from 
infectious to chronic diseases (6), which requires an increased emphasis on 
behavioral and environmental risk factors and on preventive medicine. In- 
dividual and community interventions can help prevent important health 
problems, such as injuries, teenage childbearing, and mental health problems. 
Efforts are still needed to sustain and improve on our historical successes in 
preventing infectious disease and to find ways to address such emerging 
problems as AIDS. 

One aspect of this shift in focus is that changes in mortality rates can no 
longer be thought of as proxies for changes in morbidity or health status. 
When infectious diseases represented the major health problems, before 
modern medicine was able to control them effectively, mortality was a good 
proxy for the incidence of the disease that caused it. Modern medical care, 
however, has broken the link between incidence and mortality. 

Furthermore, the chronic diseases and conditions that have replaced in- 
fectious diseases as the major causes of death and disability are more complex 
and require new outcome measures (21, 33). Individuals live longer with 
these conditions, often in poor health and with a low quality of life. The 
diseases typically have long asymptomatic stages. Thus there is a need for 
prevalence and incidence measures, as well as measures of severity and 
disease staging, functional limitations and disability, and quality of life. 

These changes also bring a new focus on the antecedent risk factors for 
injuries and chronic diseases and conditions. By one estimate, two thirds of 
all deaths and years of life lost before age 65 are attributable to a preventable 
precursor, and are thus unnecessary or premature (2). Behavioral risk factors, 
such as smoking, can themselves be thought of as negative aspects of an 
individual’s health status. Furthermore, the physical and social environments 
are increasingly viewed as important risk or protective factors, and thus they 
are also targets of intervention. 

There is, however, danger in confusing ends and means. Risk factor 
objectives are useful because they show the impact of behavioral interventions 
long before there are any changes in chronic disease mortality rates. Howev- 
er, risk-related behaviors, the use of preventive services, and the availability 
of community health protection programs should be monitored as in- 
termediate outcomes and indicators of program success or failure, but not as 
alternatives to direct health status measures. 


STATISTICAL ISSUES IN SETTING OBJECTIVES 


Responding to the change in mortality and morbidity patterns, the national 
objectives in Healthy People 2000 suggest areas in which health status 
measures are possible and needed. The objectives spell out not only specific 
health status targets, but also changes in individual risk factors and in the 
physical and social environment that can help reach the goals. The range of 
topics covered by the objectives is extensive. It includes personal behavior 
and risk factors, including physical fitness and activity, nutrition, tobacco, 
and alcohol and other drugs; psychosocial factors, including mental health 
and violent and abusive behavior; the physical environment, including un- 
intentional injuries, occupational safety and health, environmental health, and 
food and drug safety; infectious diseases, including HIV infection, and 
sexually transmitted diseases; reproductive and infant health, including family 
planning and maternal and infant health; chronic diseases, such as heart 
disease and stroke, cancer, diabetes, and oral health problems, and chronic 
disabling conditions; and services and protection, including educational and 
community-based programs, as well as clinical preventive services. 

The comprehensive list of topics offers a catalog of health status measures 
from which states and local areas can choose. In setting forth its national 
objectives, Healthy People 2000 also identifies almost 100 separate data 
sources and many measurement tools for public health assessment. The report 
contains 300 separately stated objectives, some of which have multiple parts, 
thus leading to almost 400 statistical series that must be monitored. Un- 
fortunately, those testifying at hearings organized by the Public Health Serv- 
ice and the Institute of Medicine identified the large number of 1990 objec- 
tives as an impediment to effective assessment and implementation efforts 
(42). With the more than 200 additional special population targets for high 
risk groups, quantity alone makes monitoring the year 2000 objectives at the 
national, state, and local levels a formidable challenge. 

Stating the objectives in quantitative terms is one of their great strengths, 
but the availability of data to measure progress has been a problem since the 
beginning of the Healthy People process. Green and colleagues (13), for 
instance, reviewed the data available in the early 1980s to track progress 
towards the 1990 objectives and found numerous gaps and statistical prob- 
lems. Data were eventually acquired for several objectives that were pub- 
lished in 1980 with no baseline information. Indeed, this was one of the 
intended outcomes of setting the objectives (37). At mid-decade, however, 
there were no tracking data on more than one quarter of the 1990 national 
objectives (47). Despite explicit criteria in the development of the year 2000 
objectives aimed at mitigating these problems (24), about one quarter of the 
year 2000 objectives in Healthy People 2000 cite no currently available 
baseline data (40). The large number of objectives in this situation calls into 
question the availability of sufficient data to assess progress in the 1990s. 

Andersen & Mullner (3) identify several other statistical problems with the 
1990 health objectives, and many of these problems persist in Healthy People 
2000. For instance, about one quarter of the year 2000 objectives list baseline 
data that do not correspond to the stated objective. One objective calls for 
75% of primary care providers to “provide nutrition assessment and counseling 
and/or referral to qualified nutritionists or dietitians,” but states as 
baseline data that “physicians provided diet counseling for an estimated 40-50% of 
patients” (46, p. 128). This baseline and objective disagree in terms of which 
type of provider is to provide the service, the service they provide, and the 
population to which the percentage is applied (40). The lack of precision in 
this objective might reflect underlying uncertainty among health professionals 
about who should provide which services. 

Three specific statistical issues are discussed below: the specification of 
individual objectives, the interpretation of trends, and standardization. 


Specification of Individual Objectives 


In many respects, the individual objectives are similar to items on a survey 
questionnaire. If the results are to be interpreted with confidence, careful 
development and testing are needed to ensure that the objectives are op- 
erationalized in a clear and unambiguous way. Most of the objectives in 
Healthy People 2000 are carefully written in this respect, but others exhibit 
several statistical problems that should be avoided. 

Some of the objectives are not written in a statistically operational form; 
that is, even with all of the information in hand it will be difficult to tell if the 
objective has been met. For example, objective 7.17 calls for local 
jurisdictions to have “coordinated, comprehensive violence prevention 
programs” (46, p. 240). Although a long list of attributes of coordinated and 
comprehensive programs is given in the text, no operational definition is 
provided by which to judge whether a particular jurisdiction’s program is 
coordinated and comprehensive. 

Other objectives address very complex questions that are difficult to moni- 
tor through population surveys. For example, objective 5.8 is to “increase to 
at least 85% the proportion of people aged 10 through 18 who have discussed 
human sexuality, including values surrounding sexuality, with their parents 
and/or have received information through another parentally endorsed source, 
such as youth, school, or religious programs” (46, p. 198). Although survey 
data could provide information on aspects of this objective, it is difficult to 
imagine how questions could be designed to assess the proportion of adoles- 
cents that meet the specific implied criteria. 

These problems arise because Healthy People 2000 often does not distin- 
guish between general health issues and operational measures of these issues. 
Rarely are data available in the precise form that policymakers prefer, so 
concessions must be made to data constraints. The presentation of the objec- 
tives should reflect this compromise by separately identifying the issues to be 
monitored and the best available data or proxy variables for these issues, and 
by stating targets in terms of the measurable quantities. 
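
One way to picture the separation recommended above is as a record that keeps the issue, the operational measure, the baseline, and the target in distinct fields. The sketch below is only an illustration: the `Objective` class and its field names are hypothetical, and the target value is a placeholder rather than a Healthy People 2000 figure. The baseline is the teenage birth rate quoted in Table 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Objective:
    """Hypothetical record that keeps the issue apart from its measure."""
    issue: str                        # the health issue policymakers care about
    measure: str                      # the quantity or proxy actually tracked
    unit: str                         # unit in which baseline and target are stated
    target: float                     # stated in terms of `measure`, not `issue`
    baseline: Optional[float] = None  # None when no baseline data exist
    lower_is_better: bool = True

    def is_trackable(self) -> bool:
        """An objective without baseline data cannot yet be monitored."""
        return self.baseline is not None

    def is_met(self, observed: float) -> bool:
        """Compare an observed value of the measure against the target."""
        if self.lower_is_better:
            return observed <= self.target
        return observed >= self.target


# Baseline from Table 1 (1988); the target of 30.0 is a placeholder (assumption).
teen_births = Objective(
    issue="teenage childbearing",
    measure="birth rate at ages 15-17",
    unit="live births per 1000 women ages 15-17",
    target=30.0,
    baseline=33.8,
)

print(teen_births.is_trackable(), teen_births.is_met(29.0))
```

Because the target is attached to the measure rather than to the broader issue, any mismatch between what is wanted and what can be counted stays visible in the record itself.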

Some objectives state two or more separate goals. For example, objective 
20.1 calls for reductions in the number of cases of each of eight vaccine- 
preventable diseases (46, p. 513). Such multibarreled objectives are trouble- 
some because they implicitly increase the number of objectives and contribute 
to the surveillance problems discussed above. 

Some multibarreled objectives are especially vague because the target 
groups are not clearly identified. For example, objective 9.19 states: “Extend 
requirement of the use of effective head, face, eye, and mouth protection to 
all organizations, agencies, and institutions sponsoring sporting and 
recreation events that pose risk of injury” (46, p. 285). Such objectives are very 
difficult to monitor for two reasons. First, data must be obtained for all of the 
groups mentioned or implied in the list of target populations. In many cases, it 
is not clear where this list ends. Second, the data from the different pop- 
ulations must be combined into one overall percentage to compare against the 
objective target; often, the stated objective does not say how to do this. 
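
The second difficulty can be made concrete with a small sketch. All of the groups, percentages, and weights below are hypothetical; the point is only that the same group-level data yield different overall percentages depending on a weighting choice that the objective leaves unstated.

```python
def overall_percentage(group_pct, weights):
    """Weighted average of group-level percentages; weights need not sum to 1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(group_pct[g] * weights[g] for g in group_pct) / total_weight


# Hypothetical sponsor types and the percent of each requiring protective gear.
compliance_pct = {
    "school leagues": 90.0,
    "municipal programs": 60.0,
    "private clubs": 30.0,
}

# Two defensible weightings (both hypothetical): by number of sponsoring
# organizations, or by number of participants exposed to risk.
by_organizations = {
    "school leagues": 200,
    "municipal programs": 300,
    "private clubs": 500,
}
by_participants = {
    "school leagues": 50_000,
    "municipal programs": 30_000,
    "private clubs": 20_000,
}

pct_by_org = overall_percentage(compliance_pct, by_organizations)
pct_by_participant = overall_percentage(compliance_pct, by_participants)
print(f"weighted by organizations: {pct_by_org:.1f}%")
print(f"weighted by participants:  {pct_by_participant:.1f}%")
```

With these illustrative inputs, the two weightings differ by 18 percentage points, so a single target could be judged met under one convention and missed under the other.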

Multibarreled objectives reflect a resistance to reducing the list of 
objectives to a manageable number. Public health issues are complex and admit of 
many solutions, so such resistance is understandable, but the cost, in terms of 
the resources needed to track the additional measures and the reduced 
comprehension of the overall message, can be high. Priorities must be set among 
the measures, perhaps by using the approach discussed in the section on health 
status indicators. 


Interpretation of Trends 


Population-based health interview surveys provide many of the core health- 
status measures used in the year 2000 objectives. However, health interview 
data, especially trend data, can be difficult to interpret (50). The US National 
Health Interview Survey, an important source of data for the year 2000 
objectives, measures the annual incidence of acute conditions and the preva- 
lence of chronic conditions through a combination of open- and closed-ended 
questions about the presence of specific diseases and conditions. A common 
finding from these data has been that chronic illness and disability have been 
increasing at the same time that mortality (even for related diseases) has been 
falling. At least part of this increase does not reflect actual worsening in 
physical illness. Methodological factors that may explain the trend 
include improved survey design that may have increased the proportion of the 
population reporting diseases and conditions that exist; improved access to 
medical care and better screening efforts that may have increased the 
proportion of the population diagnosed with, and therefore aware of, 
asymptomatic disease; and changing role expectations and improved disability 
benefits that may have increased the proportion of the population that 
reports work-related disability (50). 

Several objectives rely on numbers of individuals who receive treatment for 
the disease in question, because of the lack of population-based data on the 
incidence or prevalence of specific diseases. For instance, objective 15.3 calls 
for a reversal in the increasing number of persons with “end-stage renal 
disease (requiring dialysis or transplantation)” (46, p. 397). The baseline 
figures cited, however, count the number of persons who receive dialysis or 
transplantation, not those who require it. Thus, these trends reflect changes in 
diagnostic and treatment patterns, as well as access through an expanding 
federal program. It is doubtful whether future changes in the data can be 
attributed to the success of the prevention activities intended by Healthy 
People 2000. 


Standardization 


Standardization methods are used to account for demographic changes in a 
single population over time (12). For instance, if there were no changes in the 
age-specific cancer rates between 1987 and 2000, aging of the population 
alone would cause the crude cancer death rate to increase from 195.9 to 217.1 per 
100,000, assuming the Census Bureau’s median population projection for the 
US (38). 
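
The mechanism is simple arithmetic: a crude rate is the average of the age-specific rates, weighted by the population's own age distribution. The sketch below uses hypothetical rates and age shares (not the figures cited above) to show the crude rate rising while every age-specific rate is held fixed.

```python
# Hypothetical age-specific death rates (deaths per 100,000 per year),
# held fixed over time. Illustrative numbers only, not the source's data.
AGE_SPECIFIC_RATES = {"0-44": 20.0, "45-64": 300.0, "65+": 1000.0}

# Hypothetical shares of the population in each age group at two dates.
SHARES_EARLY = {"0-44": 0.70, "45-64": 0.18, "65+": 0.12}
SHARES_LATE = {"0-44": 0.65, "45-64": 0.21, "65+": 0.14}


def crude_rate(rates, shares):
    """Crude rate: age-specific rates weighted by the population's own age mix."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(rates[age] * shares[age] for age in rates)


early = crude_rate(AGE_SPECIFIC_RATES, SHARES_EARLY)
late = crude_rate(AGE_SPECIFIC_RATES, SHARES_LATE)
print(f"crude rate, early age mix: {early:.1f} per 100,000")
print(f"crude rate, late age mix:  {late:.1f} per 100,000")
```

Standardization removes this effect by replacing the population's changing age mix with a fixed set of weights.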

Standardization also serves a second, and very different, purpose. Because 
states and other geographic areas differ in the age, race, and sex composition 
of their populations, they have different unadjusted rates. For example, Florida 
has a large population of elderly people and a high unadjusted death rate. 
Adjusted rates provide a fairer comparison among areas. 

For some purposes, however, standardization could lead to difficulties. 
Standardized rates can present a different impression about the relative im- 
portance of the different causes of death, depending on the standard used. For 
example, accidents and adverse effects have a somewhat higher mortality rate 
than cerebrovascular diseases when adjusted to the 1940 population (35.0 
versus 29.7 per 100,000), but the crude cerebrovascular mortality rate is more 
than 50% higher than the crude accident mortality rate (61.2 versus 39.5 per 
100,000). 

The difference is even greater for the overall cancer death rate. The 1987 
rate is 50% higher (199.9 compared with 132.9 per 100,000) when the 1990 
population, rather than the 1940 population, is chosen as the standard. The 
choice of standard also affects trends. If we use the 1990 standard, the cancer 
death rate increased by 6.2% between 1970 and 1987; with the 1940 standard, 
however, the increase is only 2.3%. Neither one of these standards is “correct” 
in any absolute sense, but they give a very different impression. 
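
A minimal sketch of direct standardization shows why both the level and the trend depend on the standard. The age-specific rates and the two standard populations below are hypothetical and were chosen so that rates fall at younger ages and rise at older ages; the function names are mine, not those of any statistical package.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Weight age-specific rates by the standard population's age distribution."""
    total = sum(standard_population.values())
    return sum(
        age_specific_rates[age] * standard_population[age] / total
        for age in age_specific_rates
    )


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical rates per 100,000: falling at younger ages, rising at older ages.
RATES_YEAR_A = {"0-44": 20.0, "45-64": 300.0, "65+": 1000.0}
RATES_YEAR_B = {"0-44": 18.0, "45-64": 290.0, "65+": 1080.0}

# Hypothetical standard populations (millions): one young, one old.
YOUNG_STANDARD = {"0-44": 75.0, "45-64": 18.0, "65+": 7.0}
OLD_STANDARD = {"0-44": 65.0, "45-64": 22.0, "65+": 13.0}

results = {}
for name, standard in (("young", YOUNG_STANDARD), ("old", OLD_STANDARD)):
    rate_a = directly_standardized_rate(RATES_YEAR_A, standard)
    rate_b = directly_standardized_rate(RATES_YEAR_B, standard)
    results[name] = (rate_a, rate_b, percent_change(rate_a, rate_b))
    print(f"{name} standard: {rate_a:.1f} -> {rate_b:.1f} "
          f"({results[name][2]:+.1f}%)")
```

In this illustration the older standard gives more weight to the ages at which rates are high and rising, so it produces both a higher adjusted rate and a larger increase. Applied to the figures quoted above, `percent_change(132.9, 199.9)` is about 50%, which matches the difference in the 1987 cancer death rate between the two standards.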

Many statisticians favor using the 1940 US population as a standard, 
primarily because it would be consistent with the long-term practice of the 
National Center for Health Statistics and others in reporting mortality rates. 
Using this standard would facilitate the efforts of states that try to monitor 
their own progress on the objectives. Others argue against adjusting, es- 
pecially to the 1940 population, because it masks the public health impact of 
the levels seen in crude death rates. One compromise would be to standardize 
the rates to a more recent population, such as the 1990 US population. This 
would give a better picture of the current public health impact of the various 
diseases (as measured by the relative numbers of deaths) and would provide 
the analytical benefits of age-adjustment. The difficulty with using a new 
standard is that special calculations are needed to adjust past data for trend 
analyses. The need for general changes in statistical reporting systems associ- 
ated with Healthy People 2000, however, might provide an opportunity to 
switch all mortality reporting to a more current standard. 


HEALTH STATUS INDICATORS 


The 300 specific objectives in Healthy People 2000 are useful for public 
health officials who try to design programs in particular areas, but they are too 
numerous for the public and political leaders to follow on a regular basis. A 
short list of health status indicators that can be understood and followed on a 
regular basis is necessary to maintain public and professional attention (15, 
42). For instance, at a 1990 hearing on the year 2000 health objectives, 
Senator Jeff Bingaman called for the development of a short list of objectives 
that could be used for annual health “check-ups” at the national, state, and, 
perhaps, substate levels (31). 

Healthy People 2000 recognizes this need. It calls for the development and 
implementation of “a set of health status indicators appropriate for Federal, 
State, and local health agencies,” and an initial set of indicators was published 
by the Public Health Service in 1991 (7a). In light of the importance of the 
indicators and the likelihood that they will continue to be developed and 
refined throughout the 1990s, the following suggestions are offered about 
criteria for selecting the indicators. I also present a potential list of measures 
and discuss alternative approaches. 


Criteria for Health Status Indicators 


Health status indicators should sum up, almost at a glance, the health status of 
the community to which they apply: the nation, a state or local area, or some 
other defined population. Although policy makers and the media might prefer 
a single index of the health of the community that could be monitored on a 
regular basis (much like the nation follows the Gross National Product to 
assess the health of the economy), the multidimensional nature of health 
status means that no single, all-encompassing health index can paint a 
complete picture of the health status of the community. It should be possible, 
however, to construct a relatively short list of indices that, taken as a group, 
sum up most of the important aspects of the community’s health. 

Based on the work of the social indicators movement, Andrews (4, p. 27) 
has summarized some of the key characteristics of health indicators: “a limited 
yet comprehensive set of coherent and significant indicators which can be 
monitored over time, and which can be disaggregated to the level of the 
relevant social unit.” 

Organizing the health status indicators by age offers several advantages. 
First, this structure reflects life cycle patterns in health problems and priorities 
and the frequent use of age as an organizing principle in discussions of health 
promotion and disease prevention. This organization also provides continuity 
with the original Healthy People (44) and the 1990 health objectives (48). 
Furthermore, as Chapter 2 of Healthy People 2000 (46) shows, organizing the 
objectives by age groups makes it easier to identify the issues to include; 
looking at one age group at a time obviates the need for imponderable 
comparisons between, say, reduction in teenage fertility and improvements in 
the quality of life for older persons. 

Finally, the individual indicators must accurately reflect changes in the 
health of the community, rather than changes in health delivery systems or 
reporting systems. The indicators should be, to the extent possible, available 
and interpretable at the state and local levels, and the number of indices 
should be small enough so that both the public and public health policy 
makers can understand the message that the indicators carry. 

The health status indicators are not a substitute for a comprehensive public 
health assessment system; they are intended to help a community compare the 
health of its population with other communities as a means of identifying 
problems and goals. Indeed, comprehensive information on particular health 
problems and risk factors is needed to follow up on the leads suggested by 
the indicators. 


Proposed Health Status Indicators 


To illustrate the tradeoffs necessary in developing a short list of health status 
indicators and to serve as a point of departure for further development of 
indicators, Table 1 contains a list of 23 particular measures. The list is 
organized in five age groups, and the table gives the major health issue to be 
addressed (“low birth weight,” for example) as well as the specific measure(s) 
of it in the text that follows (percent of live births that weigh less than 2500 
grams). 

Table 1 Proposed health status indicators 

Infants (Under 1 year) 
  Deaths: Infant mortality rate (9.1 deaths of infants under 1 year per 1000 
    live births in 1990) 
  Low birth weight: Births of babies weighing less than 2500 g (6.9% of live 
    births) 
  Prenatal care: Proportion of infants born to women who received prenatal 
    care in the first trimester of pregnancy (76%) 

Children (Ages 1-14) 
  Deaths from injury: Death rate for accidents, homicide, and suicide combined 
    (16.6 deaths per 100,000 children ages 1-14) 
  Immunization: Proportion of children ages 1-4 reported immunized for 
    measles, rubella, DPT, polio, and mumps (55.3-64.9% for each immunization 
    separately in 1985) 
  Toxic exposures: Prevalence of blood lead levels exceeding 15 µg/dL among 
    children ages 6 months through 5 years (15.4 per 100,000 in 1984) 

Adolescents/Young Adults (Ages 15-24) 
  Deaths from injury: Death rate for accidents, homicide, and suicide combined 
    (75.8 deaths per 100,000 persons ages 15-24) 
  Teenage childbearing: Birth rate at ages 15-17 (33.8 live births per 1000 
    women ages 15-17 in 1988) 
  Use of dangerous substances: Proportion of adolescents ages 12-17 who used 
    the following substances in the past month 
      Tobacco (12% in 1990) 
      Alcohol (25% in 1990) 
      Cocaine (0.6% in 1990) 
  Sexually transmitted diseases: Incidence of gonorrhea (1123 cases reported 
    per 100,000 adolescents ages 15-19 in 1989) 

Adults (Ages 25-64) 
  Premature chronic disease mortality: Death rate for cancer, heart disease, 
    stroke, and diabetes combined (264.5 deaths per 100,000 persons ages 25-64) 
  AIDS/HIV: Incidence of AIDS (16.6 new cases reported per 100,000 persons 
    ages 13 and over in 1990) 
  Smoking: Prevalence of cigarette smoking (28.8% current smokers among 
    persons ages 20 and older) 
  Nutrition/physical activity: Prevalence of overweight (21% of persons ages 
    18 and over with body mass index greater than 27.8 kg/m² for men and 
    27.3 kg/m² for women) 
  Workplace injury: Incidence of injuries resulting in medical treatment, lost 
    time from work, or restricted work activity (8.1 cases per 100 full-time 
    workers) 
  Chronic disease screening: 
      Breast cancer: Proportion of women ages 50 and over who received a 
        clinical breast exam and a mammogram within the preceding year (19%) 
      Serum cholesterol: Proportion of persons ages 18 and over who have ever 
        had their blood cholesterol level checked (59% in 1988) 

Older Adults (Ages 65 and over) 
  Activity restrictions: Proportion of the noninstitutionalized population age 
    65 and over with partial or complete limitation of major activity from 
    chronic conditions (22.8% in 1989) 
  Disabling injury: Incidence of hip fractures (714 hospital discharges per 
    100,000 persons age 65 and over in 1988) 
  Immunization: Proportion of persons ages 65 and over who received an 
    influenza vaccination in the preceding 12 months (34%) 
  Dental health: Proportion of persons ages 65 and over who have lost all of 
    their natural teeth (36% in 1986) 

To reflect the national goal of reducing disparities between population 
groups, special target groups could be identified for some of the measures in 
which the disparities are especially great. To reflect the other two national 
goals, indicators were chosen to ensure that each age group includes measures 
that address mortality and morbidity, disability and quality of life, and access 
to preventive services. Some of the measures are intended to be com- 
prehensive measures for a particular age group; the death rate for cancer, heart 
disease, stroke, and diabetes combined for ages 25-64, for instance, is a 
comprehensive measure of preventable chronic disease mortality. Other mea- 
sures are sentinel in nature. For instance, breast cancer and serum cholesterol 
screening are indicators of the general availability and use of preventive 
services for adults. 

The proposed health status indicators incorporate measures of health status, 
risk reduction, and use of preventive services. Year 2000 objectives from 
Healthy People 2000 are used when possible. In some cases, other measures 
are proposed to match the age-based structure or the summary nature of the 
health status indicators. The indicators were also chosen to relate to as many 
priority areas as possible and to be available at the state and local levels. 

For infants, mortality is a standard summary measure of both mortality and 
access to health care. Low birth weight is included because it is both an 
important health status outcome in itself and a proxy for disability and a 
broader set of problems than infant mortality. Use of prenatal care appears as 
the critical access issue for infants. 

For children, the death rate from accidents, homicide, and suicide com- 
bined is a summary measure of mortality and a proxy for morbidity and 
disability from unintentional and intentional injuries. Together, these causes 
represent about half of the deaths in the age group. 

Immunization of children below school age is used as a key clinical 
preventive service measure. Lead exposure is included as a sentinel measure 
for toxic exposures in general, even though data may not be available on a 
regular, national basis, because it is one of the most important environmental 
toxins. 

The adolescents and young adults group covers a period of transition from 
parental to individual responsibility for health and safety and is one in which 
important high risk behaviors are, or are not, initiated. The death rate for 
accidents, homicide, and suicide combined is included because injuries are 
the leading cause of death and disability in this age group. As for children, the 
mortality rate is taken as a proxy for the disability caused by injuries. The 
teenage fertility rate is used as a key indicator of the future quality of life of 
both the mother and child. Unlike in Healthy People 2000, the fertility rate, 
rather than the pregnancy rate, is used because it is available from vital 
statistics for even small areas. We identify the use of tobacco, alcohol, and 
cocaine because of their direct health consequences and because they are risk 
factors for both immediate threats to health and chronic disease later in life. 
We also include sexually transmitted diseases because the incidence rate 
peaks for this age group. Gonorrhea serves as a sentinel measure because it is 
a reportable disease (other potentially important ones, such as pelvic in- 
flammatory disease, are not) and it is more common than syphilis. 

Among adults, the death rate from the four leading chronic diseases (heart 
disease, cancer, stroke, and diabetes) combined provides a summary measure 
of premature mortality in this age group. This mortality rate is also a proxy for 
disability caused by the same chronic conditions. AIDS incidence is also 
included with this age category because its effect peaks among adults. Smok- 
ing prevalence and obesity are included as risk factors for chronic disease. 
Obesity is a proxy for both nutrition and physical activity and is more easily 
measured through population surveys. Workplace injuries are also included in 
this group as a key factor in disability and because such injuries peak in this 
age group. Breast cancer screening and cholesterol screening are chosen as 
measures of the use of preventive services. Both are important in their own 
right and have more room for improvement than the more universally 
accepted Pap smears or hypertension screening. 

For older adults, both a general disability measure—limitations of major 
activities because of chronic conditions—and a major preventable cause of 
disability in this age range—hip fractures—are included. Loss of teeth is also 
an important cause of disability and has ramifications for such issues as 
nutrition and social isolation. Influenza vaccination is included as an indicator 
of access to and use of preventive services. No mortality measures are 
included for this age range, so we can focus on improving the quality of life, 
rather than simply prolonging it. 


Alternative Approaches 


Both the objectives in Healthy People 2000 and the proposed indicators come 
out of the public health tradition. Recent work on “health status assessment” 
by health services researchers and others concerned with assessing the
outcomes of health care for policy and medical purposes may eventually be useful
for public health assessment. 

According to Patrick & Bergner (32), “health-related” quality of life can be 
thought of in five concepts or domains: duration of life; impairments, such as 
subjective complaints, physical signs, self-reported disease, physiological 
measures, and medical diagnoses; functional status in physical, psycholog- 
ical, and social domains; health perceptions, including satisfaction with health 
and more general perceptions; and opportunity, including social or cultural 
handicaps and individual resilience. By using this framework, researchers 
have developed measurement tools to assess health status in general and for 
particular conditions. For instance, the RAND Medical Outcomes Study has 
developed a comprehensive measurement tool with only 20 questions suitable
for use in self-administered population surveys (39). More specific indices 
have been developed for several particular populations, diseases, and con- 
ditions (22). 

In health services research, the multidimensional nature of health status has 
been approached in at least two ways. First, weights for combining scores on 
multiple dimensions have been determined so that the aggregate score reflects 
the overall preferences of some reference population for tradeoffs between 
different disease and disability states. The Quality of Well Being (QWB) 
scale, for instance, is a preference weighted measure of symptoms and 
functioning that provides a score ranging from 0.0 for death to 1.0 for 
asymptomatic, optimum functioning (18). To weigh an individual’s risk 
factors, perhaps some variant of the statistical methodology used in health risk 
appraisals (8) could be developed. 

Another approach is to combine mortality and health status in a single 
index, often known as “Quality-Adjusted Life Years.” This is a cohort 
measure in which the life table proportion that survives to each age is 
multiplied by a weight, such as the average QWB score, that expresses the 
average “health status” of individuals who reach that age. Erickson and 
colleagues (11) describe how such a statistic can be calculated for national 
populations based on data from health interview surveys, and a version has
been incorporated into objective 17.1 of Healthy People 2000. Kaplan &
Anderson (17) have proposed a model for integrating risk factors and health
status information over time to form a comprehensive measure of
health-related quality of life. Although theoretically promising, these
approaches have not been applied on a national scale, and their use on a
population basis deserves further development and testing.
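
The cohort calculation behind a quality-adjusted index can be sketched in a few lines. This is a minimal illustration, not the method of Erickson and colleagues: the abridged life table, the health-status weights, and the trapezoid approximation of person-years are all assumptions made here for the example, and none of the numbers come from the studies cited.

```python
def quality_adjusted_life_expectancy(ages, survival, weights):
    """Sum person-years lived in each age interval, weighted by health status.

    ages     -- interval boundaries in years, e.g. [0, 20, 40, 60, 80, 100]
    survival -- proportion of the birth cohort alive at each boundary
    weights  -- average health-status weight for each interval
                (0.0 = death, 1.0 = asymptomatic, optimum functioning)
    """
    if len(ages) != len(survival) or len(weights) != len(ages) - 1:
        raise ValueError("need one weight per age interval")
    total = 0.0
    for i, weight in enumerate(weights):
        width = ages[i + 1] - ages[i]
        # Trapezoid rule: average proportion alive across the interval.
        person_years = width * (survival[i] + survival[i + 1]) / 2.0
        total += weight * person_years
    return total


# Hypothetical abridged life table and weights, for illustration only.
AGES = [0, 20, 40, 60, 80, 100]
SURVIVAL = [1.00, 0.98, 0.95, 0.85, 0.50, 0.00]
WEIGHTS = [0.95, 0.90, 0.85, 0.75, 0.60]

# With every weight set to 1.0 the index reduces to ordinary life expectancy.
life_expectancy = quality_adjusted_life_expectancy(AGES, SURVIVAL, [1.0] * 5)
quality_adjusted = quality_adjusted_life_expectancy(AGES, SURVIVAL, WEIGHTS)
print(f"life expectancy:  {life_expectancy:.1f} years")
print(f"quality-adjusted: {quality_adjusted:.1f} years")
```

The gap between the two figures is the number of years lost, on average, to less than optimum health under the assumed weights.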


DATA FOR STATE AND LOCAL AREAS 


If Healthy People 2000 is to achieve its potential, communities of all sizes— 
states, counties, municipalities, and such groups as a company’s employees 
and their families—must adopt their own objectives and measure their prog- 
ress toward them through the 1990s (42, pp. 15-27). States and smaller 
communities, however, often find that data are unavailable or of poorer 
quality than national data. By assessing the ability of states to monitor the 
draft year 2000 objectives prepared in 1989, for instance, the Public Health 
Foundation found that on average states could only monitor 39% of the 
objectives. The proportion that could be monitored ranged from 27% to 58% 
across states (35). Data problems are more severe when information on racial,
ethnic, and socioeconomic groups is needed.

The problems of state and local health departments cannot be solved simply 
by disaggregating national data. No national survey is likely to have a large
enough sample to provide reliable direct estimates for all of the
subpopulations required. Furthermore, current denominator data by race,
ethnicity, and socioeconomic status are not generally available from the US
Census Bureau. Rather than a single national survey, common survey meth- 
odology that can be replicated easily at the state and local levels, such as the 
Centers for Disease Control’s Behavioral Risk Factor Surveillance System 
(36), needs to be developed. 

Even when data are available for small geographical areas, as they are for 
vital statistics, the rates are unreliable because the events are infrequent. One 
approach to the sparse data problem is to use measures that are stable at the 
local level as proxies for the measures used in the national objectives. For 
instance, a local health department might choose to monitor infant health in 
terms of the proportion of low birth weight babies, rather than the infant 
mortality rate. Because the proportion of babies born with low birth weight is 
higher than the proportion that dies, this rate is more reliable for small areas. 
In choosing such proxy measures, however, it is important to verify that 
changes in the proposed measure truly reflect changes in the health 
characteristic to be monitored. 
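
The gain in reliability from a more frequent event can be made concrete with the relative standard error of a proportion. The sketch below assumes simple binomial variation, and the number of births and both proportions are hypothetical values chosen only for illustration.

```python
import math


def relative_standard_error(proportion, n_births):
    """Binomial standard error of a proportion, divided by the proportion."""
    return math.sqrt((1.0 - proportion) / (n_births * proportion))


N_BIRTHS = 1000            # hypothetical annual births in a small area
INFANT_MORTALITY = 0.010   # hypothetical: 10 deaths per 1,000 live births
LOW_BIRTH_WEIGHT = 0.070   # hypothetical: 7% of babies born at low weight

rse_mortality = relative_standard_error(INFANT_MORTALITY, N_BIRTHS)
rse_low_birth_weight = relative_standard_error(LOW_BIRTH_WEIGHT, N_BIRTHS)
print(f"infant mortality rate: relative SE {rse_mortality:.0%}")
print(f"low birth weight:      relative SE {rse_low_birth_weight:.0%}")
```

Under these assumptions the low birth weight proportion is estimated with roughly one third of the relative error of the infant mortality rate from the same births.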

Another approach is to use formal statistical methods designed for small 
areas. These are not yet commonly used in public health assessment, but are
discussed below because they warrant further development.


Statistical Models for Small Areas 


For measures that are too variable at the state or local levels, three, five, or 
more years of numerator data can be aggregated into one or a running series of 
calculated rates. Such measures are slower to show the impact of in- 
terventions because they include data from past years, but they may be stable 
enough to show meaningful trends. When rates change over time, aggregated 
rates are not comparable unless all of the rates are based on the same number 
of years. Thus, standards are needed to judge whether the variability of rates 
and measures is sufficiently small for tracking purposes and to ensure that the 
results are comparable within states and the nation. 
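
A running series of pooled rates might be computed as follows. This is a sketch under stated assumptions: the function name, the county counts, and the constant population are hypothetical, and the window length is a parameter so that every rate in a series rests on the same number of years.

```python
def running_rates(events, population, window=3, per=100_000):
    """Pooled rates over a moving window of consecutive years.

    Each rate pools the numerators and denominators of `window` years,
    so all values in the returned series are based on the same span.
    """
    if len(events) != len(population):
        raise ValueError("events and population must cover the same years")
    rates = []
    for start in range(len(events) - window + 1):
        numerator = sum(events[start:start + window])
        denominator = sum(population[start:start + window])
        rates.append(per * numerator / denominator)
    return rates


# Hypothetical county: annual deaths and mid-year population for six years.
DEATHS = [4, 9, 2, 7, 3, 5]
POPULATION = [50_000] * 6

annual = [100_000 * d / p for d, p in zip(DEATHS, POPULATION)]
three_year = running_rates(DEATHS, POPULATION, window=3)
print("annual rates:    ", annual)
print("three-year rates:", three_year)
```

The pooled series varies far less from one value to the next than the annual series, at the cost of responding more slowly to a real change.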

Kalton (16) has proposed four statistical models for small area estimation 
that have potential for public health assessment. “Synthetic estimation” uses 
information on the age, sex, and race distribution within a small area in 
combination with national race-, age-, and sex-specific rates of the outcome 
in question to estimate the prevalence in the small area. Elston and colleagues 
(10), for instance, have applied this approach to estimate the number of 
functionally dependent individuals for states and counties. “Regression es- 
timation” uses information from a sample of small areas with complete data 
on a continuous outcome variable, e.g. the maternal mortality rate, and other 
generally available predictor variables to estimate a regression equation; these
results then provide predicted values of the maternal mortality rate in other 
communities for which the predictor variables are available. “Structure pre- 
serving estimation” techniques use the methods of discrete data analysis, such 
as iterative proportional fitting (5), to combine survey-based information on 
the age and sex structure of an outcome, such as disability, with census 
information on the number of individuals in a community to estimate the 
prevalence of disability in a small community. “Composite estimation”
combines information from the community in question (which might have a high
degree of variability, depending on the size of the population) with a
model-based estimate, such as those described above, according to an empirical
Bayes model (27). Manton and colleagues (23), for instance, describe the use
of such a model to stabilize cancer mortality rates for counties in the US. 
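
Two of these estimators are simple enough to sketch. The code below is an illustration under stated assumptions, not the procedure of any study cited: the strata, population counts, national rates, and local survey result are hypothetical, and the variance of true area values around the model-based estimate is supplied as a constant, whereas an empirical Bayes model would estimate it from the data.

```python
def synthetic_estimate(local_population, national_rates):
    """Apply national stratum-specific rates to the local population mix."""
    expected_cases = sum(
        count * national_rates[stratum]
        for stratum, count in local_population.items()
    )
    return expected_cases / sum(local_population.values())


def composite_estimate(direct, direct_variance, model_based, model_variance):
    """Shrink a noisy direct estimate toward a model-based estimate.

    The weight on the direct estimate falls as its sampling variance grows
    relative to `model_variance`, the assumed variance of true area values
    around the model-based estimate.
    """
    weight = model_variance / (model_variance + direct_variance)
    return weight * direct + (1.0 - weight) * model_based


# Hypothetical older population of one county, by age group and sex.
LOCAL_POPULATION = {
    ("65-74", "F"): 600, ("65-74", "M"): 500,
    ("75+", "F"): 500, ("75+", "M"): 400,
}
# Hypothetical national prevalence of disability in the same strata.
NATIONAL_RATES = {
    ("65-74", "F"): 0.10, ("65-74", "M"): 0.08,
    ("75+", "F"): 0.30, ("75+", "M"): 0.25,
}

synthetic = synthetic_estimate(LOCAL_POPULATION, NATIONAL_RATES)

# Hypothetical local survey: 25 of 100 respondents report a disability.
direct = 25 / 100
direct_variance = direct * (1 - direct) / 100
composite = composite_estimate(direct, direct_variance, synthetic, 0.000625)
print(f"synthetic {synthetic:.3f}, direct {direct:.3f}, composite {composite:.3f}")
```

The synthetic estimate reflects only the local age and sex mix, so it cannot differ from the national picture within a stratum; the composite estimate moves toward the local survey result only as far as the survey's precision justifies.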

As Kalton points out, all of these approaches depend on a statistical model, 
so the choice of a good model and effective auxiliary variables is important. 
Unless the auxiliary variables are strongly related to the outcome variable in 
question, the small area estimates will vary little from one area to another 
(16). In practice, the choice of the model and auxiliary variables is limited by 
the data available. Thus, although these approaches may be useful for health 
planners in predicting health care needs, they will be helpful for public health 
assessment purposes only if auxiliary variables are available to reflect changes 
over time and local differences from national levels accurately. 


Setting Targets and Priorities 


Although Healthy People 2000 presents numerical targets for most of the 
national objectives, it offers only limited information on how these targets 
were chosen. Progress reviews have no meaning if the targets are not reason- 
ably chosen and the rationale is not clearly explained to the public. When 
other groups set their own targets, they need guidance in determining what is 
achievable. 

The objective on coronary heart disease exemplifies the problem. The 
national objective calls for an annual coronary heart disease mortality rate of 
no more than 100 deaths per 100,000 population. This target implies a 2.3% 
annual decline, compared with the 3.0-3.1% annual decline (depending on 
the starting year) in the last two decades. The target is said to be “chosen on 
the basis of trend evaluation and expert judgment, and reflects the continuing 
downward trend in the overall coronary heart disease death rate” (46, p. 395), 
but Healthy People 2000 offers no reason for the deceleration. The target for
this objective is particularly important; the difference between this
target and one slightly more optimistic than the historical trend—90 per 
100,000—is equivalent to about seven months in life expectancy for the total 
population (43). 
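
The annual decline implied by a target can be checked with a short calculation. The baseline below (135 deaths per 100,000 in 1987) is an assumption supplied for illustration, since the passage above does not state it; with that baseline the arithmetic is consistent with the 2.3% figure and with the alternative target of about 90 per 100,000.

```python
def implied_annual_decline(baseline_rate, target_rate, years):
    """Constant annual proportional decline that takes baseline to target."""
    return 1.0 - (target_rate / baseline_rate) ** (1.0 / years)


def projected_rate(baseline_rate, annual_decline, years):
    """Rate reached after `years` of a constant proportional decline."""
    return baseline_rate * (1.0 - annual_decline) ** years


BASELINE_RATE = 135.0   # assumed baseline per 100,000; not given in the text
BASELINE_YEAR = 1987    # assumed baseline year
TARGET_RATE = 100.0     # national target per 100,000 for the year 2000
YEARS = 2000 - BASELINE_YEAR

decline = implied_annual_decline(BASELINE_RATE, TARGET_RATE, YEARS)
rate_if_trend_continues = projected_rate(BASELINE_RATE, 0.030, YEARS)
print(f"annual decline implied by the target: {decline:.1%}")
print(f"rate in 2000 at a 3.0% annual decline: {rate_if_trend_continues:.0f}")
```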

To set meaningful and feasible targets, state and local areas must consider
differences from national values in baseline rates and trends in the measures in 
question, in addition to standardizing for population composition. Targets can 
also be set by comparison with other geographic areas or with epidemiological 
models that account for important risk factors in the population. 

There are several statistical methods that can help the working groups set 
meaningful numerical targets. None of these can be used on a strictly mechan- 
ical basis, and all require significant subject matter judgment. However, these 
methods can give some idea of what will probably happen in the absence of 
further interventions or indicate the likely impact of interventions on out- 
comes. Thus, models can help set or fine-tune the targets. 

The most straightforward statistical model is simple trend analysis. Such 
models can predict the level of various objective measures—assuming that 
current trends continue—as well as provide statistical confidence intervals. 
Objectives should usually be somewhat more favorable than what the trend 
analysis suggests will happen anyway. Projections should not be blindly 
accepted as the year 2000 target; rather, the target should be set higher or 
lower than the projected value, according to a subjective assessment of the 
progress that is possible (41). 
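
A simple trend projection of this kind might look like the following sketch. It assumes a log-linear trend fitted by ordinary least squares, uses a hypothetical series of annual rates, and builds its interval from a normal quantile, which is only approximate and too narrow for a series this short.

```python
import math
from statistics import NormalDist


def project_log_linear(years, rates, target_year, level=0.95):
    """Fit log(rate) = a + b * year by least squares and extrapolate.

    Returns (projection, lower, upper) for a new observation in
    `target_year`, assuming the fitted trend continues unchanged.
    """
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(years, logs)]
    residual_variance = sum(e * e for e in residuals) / (n - 2)
    # Standard error of a single new observation at target_year.
    se = math.sqrt(
        residual_variance * (1 + 1 / n + (target_year - mean_x) ** 2 / sxx)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = intercept + slope * target_year
    return math.exp(centre), math.exp(centre - z * se), math.exp(centre + z * se)


# Hypothetical death rates per 100,000 for 1980 through 1987.
YEARS = list(range(1980, 1988))
RATES = [166.0, 161.0, 157.0, 151.0, 148.0, 143.0, 139.0, 135.0]

projection, lower, upper = project_log_linear(YEARS, RATES, 2000)
print(f"projected rate in 2000: {projection:.0f} ({lower:.0f} to {upper:.0f})")
```

The projection describes what is likely without further intervention; the target itself would then be set above or below it by judgment.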

Models that identify the lowest possible morbidity and mortality rates that 
have been observed in specific groups could also be useful in setting targets. 
The specific groups could be other countries or geographic, racial, ethnic, or 
socioeconomic subpopulations of the United States. Woolsey (51), for in- 
stance, has proposed a version of this. Hahn and colleagues (14) have 
estimated the possible reduction in mortality rates that can be expected with 
the elimination of the most important risk factors for chronic disease. 

Mathematical models that relate health outcomes to specific interventions 
for many specific diseases and health behaviors can also be helpful in 
determining targets. These models provide insight into achievable health 
outcome levels and the relationship between the process and outcome objec- 
tives. For instance, the National Cancer Institute has developed a model to 
project cancer incidence and mortality under various cancer control programs, 
such as prevention programs, screening, and treatment (20). Such models 
require more data than simple trend analyses and take time to develop and 
verify. In addition, there can be substantial uncertainties in modeling the 
interventions and interactions among them. The modeling process itself, 
however, helps focus discussion and thinking and leads to a range of plausible 
targets. Similar models have been, or are being, developed for cardiovascular 
disease, AIDS, and other diseases (49). By using such models as appropriate, 
Closing the Gap synthesizes much of what is known about the potential health 
effects of health promotion and disease prevention (4). 

Simple extrapolation models and process models, such as the one for 
cancer, form two extremes of a spectrum. Extrapolation models that consider
age-period-cohort effects, projected demographic changes, and other factors 
(7) fall between the two and offer some promise. 


IMPLEMENTATION 


The national objectives in Healthy People 2000 provide both the motivation 
and a good starting point for the development of public health assessment 
efforts at the national, state, and local levels. However, substantially more 
effort is needed in the 1990s if we are to achieve the promise of these 
objectives in the year 2000. 

Because states and local areas are expected to develop objectives of their 
own that parallel the national objectives, Healthy People 2000 itself is 
actually the first step in specifying a national public health data set. Recogniz- 
ing this, the objectives and targets used in the national objectives must be 
documented to communicate to public health statisticians at all geographical 
levels the assumptions and methods used at the national level. 

At the national level, documentation is needed on the technical characteris- 
tics of data sources used in Healthy People 2000; the detailed identifiers 
needed to adapt these data for the purposes of the objectives, such as ICD 
codes and hospital discharge codes; and the procedural definitions, such as the 
calculation of “body mass index” and the appropriate age- and sex-specific 
reference values. 

For states, counties, and smaller localities, additional information is needed 
to implement comparable surveillance systems for the objectives. Model 
survey questions and survey methodologies for state and local use must be 
developed and disseminated. Careful consideration must also be given to 
identifying appropriate proxy measures and alternative methodologies for 
state and local level assessment, where events are infrequent and rates would 
be highly variable or where disaggregated geographical data are not available. 

When data are available, methods are needed to translate the national 
numerical targets to the state and local levels. Such methods as
age-adjustment and synthetic estimation should be explored. To determine
feasible local targets, common methods for trend analysis, risk factor control
analysis, and comparison with other geographic areas must be developed and 
made available. Modeled after the efforts of the European Regional Office of 
the World Health Organization (52), computer software could be developed to 
graph, analyze, and project state or local trends in conjunction with national 
data. 

Because of its leadership role in health statistics, the federal government 
can help improve the surveillance tools needed to assess progress on the 
objectives and to set priorities for future actions. The Public Health Service 
acknowledges the need for strengthened public health assessment efforts and
devotes Chapter 22 of Healthy People 2000 to surveillance and data systems. 
Meeting these surveillance objectives will require vigorous national leader- 
ship and the collaboration of numerous federal agencies, state and local health 
agencies, and many private sector organizations. Because assessment is 
fundamental to the objectives process and public health in general, these 
activities surely deserve vigorous support. 


INTRODUCTION


Although little is known in the United States about hemorrhagic fever with 
renal syndrome (HFRS), this disease is a significant cause of human morbid- 
ity and mortality across Eurasia. The severest form of HFRS is most prevalent 
in Asia, where more than 150,000 cases occur annually (52), with consistent
mortality rates of about 5%. The disease has been given a multitude of names 
(22), which has contributed to the confusion concerning its actual distribution 
and epidemiology. In China, the disease is known as epidemic hemorrhagic 
fever; in Korea, it is called Korean hemorrhagic fever. In Scandinavia, 
western USSR, and western Europe, a milder form of this disease occurs, 
with case-mortality rates of less than 1%. This form of HFRS is called 
nephropathia epidemica. The World Health Organization adopted the term 
“hemorrhagic fever with renal syndrome” to serve as a unifying name for 
these and related conditions (21). 

This group of diseases is caused by a newly recognized group of viruses, 
the genus Hantavirus, of the family Bunyaviridae. These viruses are main- 
tained in nature primarily in rodents of the superfamily Muroidea, which 
includes the common species of rats (Rattus spp.), house mice (Mus muscu- 
lus), field or woodland mice (Apodemus spp.), and voles (Microtus and 
Clethrionomys spp.). Although other mammals are occasionally infected with 
these viruses, in each geographic region where distinctive forms of HFRS 
occur, each hantavirus is primarily associated with a single species of rodent 
(Table 1). Unlike other bunyaviruses that cause disease in humans (such as 
California encephalitis and Sandfly fever viruses), arthropod vectors are 
believed to play a negligible role in the transmission of hantaviruses. Human 
infection results from inhalation or contact with virus excreted or secreted in 
rodent urine, saliva, or feces. Rodent bite has rarely been implicated in human 
infection (17), although this route of transmission may play a role in
rodent-to-rodent transmission (25). Humans do not excrete or secrete large amounts
of virus during infection, and are thus unlikely to transmit the virus to other
humans or rodents.


Table 1 Members of the genus Hantavirus (Bunyaviridae), their rodent reservoirs, geographical
distribution, and associated human disease(s)

Virus          Primary rodent host                         Distribution                               Disease
Hantaan        Apodemus agrarius (Striped field mouse)     China, Korea, eastern USSR                 Epidemic hemorrhagic fever, Korean hemorrhagic fever
               Apodemus flavicollis (Yellow-necked mouse)  Balkans                                    Hemorrhagic fever with renal syndrome (severe)
Seoul          Rattus norvegicus (Common or brown rat)     Worldwide                                  Epidemic hemorrhagic fever* (mild type)
Puumala        Clethrionomys glareolus (Bank vole)         Scandinavia, western USSR, eastern Europe  Nephropathia epidemica
Prospect Hill  Microtus pennsylvanicus (Meadow vole)       United States (Maryland, Minnesota)        None known
Leakey†        Mus musculus (House mouse)                  United States (Texas)                      None known

* Disease associated with rat-borne hantaviruses is documented in Asia only.
† Proposed new virus.

These viruses are of special interest because they occur in the United
States, as well as in other countries. At present, three different hantaviruses
have been identified in the US; of these, the Seoul virus of Norway rats
(Rattus norvegicus) is clearly associated with human disease in other
countries. Residents of major cities in the US, especially those dwelling in the inner
cities, are routinely exposed to rats infected with hantaviruses and may, in 
turn, become infected. There is increasing evidence that infection may cause 
an acute disease and predispose an individual to subsequent development of 
chronic renal disease or hypertension or increased risk of cerebrovascular 
accident (26). Thus, the hantaviruses represent an emerging viral disease that 
offers new challenges and, perhaps, explains some long-standing questions. 


HFRS, HANTAVIRUSES, AND RODENT HOSTS 


Hantaviruses and the diseases they cause can best be understood by examining 
the individual viruses and the ecological relationships to their rodent hosts. 


Korean Hemorrhagic Fever and Hantaan Virus 


Western physicians and scientists first became aware of human disease due to 
hantaviral infections during the Korean conflict, when more than 2400 United 
Nations forces were infected with a mysterious “new” disease, then called 
Korean hemorrhagic fever (18, 73, 75). Although this disease was new to
Western science, it was not new to the region: Japanese and Russian physi- 
cians had described an identical disease in Manchuria during the 1930s and 
1940s, when the Japanese lost thousands of troops to the illness (75). The 
Allied Forces in Korea established the Hemorrhagic Fever Commission to 
investigate the disease; but, in spite of a massive effort, the causative agent, 
Hantaan virus, was not isolated until 1976. 

Lee and colleagues (60, 62), working at Korea University in Seoul, were 
the first to detect and then isolate the etiological agent of Korean hemorrhagic 
fever. An antigen that reacted with convalescent sera from Korean 
hemorrhagic fever patients was found in the lungs of the striped field mouse, 
Apodemus agrarius (60). A virus was subsequently isolated from lung tissue 
from the same species and named “Hantaan virus” in recognition of the 
Hantaan River, which flows through the endemic region of Korea (62). 
Shortly thereafter, Hantaan virus was adapted to growth in cell culture (20), 
and an immunofluorescent antibody (IFA) test was developed. The
availability of antigen and an assay allowed for rapid progress in defining the
distribution and epidemiology of HFRS. Within five years, the diverse clinical
entities, which occurred across Asia and Europe under several different
names, were shown to be caused by viruses related to Hantaan virus (Table
2).


Table 2 Chronology of selected events in the study of hemorrhagic fever with renal
syndrome

Event                                                        Approximate dates
Clinical descriptions of HFRS in Asia and Europe             1930s-1950s
Isolation of Hantaan virus                                   1976
Cell culture adaptation                                      1981
Relatedness of agents causing KHF, EHF, and NE demonstrated  1980-1983
Global distribution of infected rodents                      1982-1987
Grouping of agents in genus Hantavirus, family Bunyaviridae  1983-1985
Widespread human infection and disease                       1983-1988
Association with chronic disease                             1990

Extensive surveys of potential animal reservoirs in Korea, as well as 
investigations into the possible infection of ectoparasites associated with these 
species, failed to reveal additional hosts for Hantaan virus other than A. 
agrarius (61). Experimental infections of this natural rodent host revealed an 
unusual characteristic of Hantaan virus and, as it turned out, all hantaviruses. 
After A. agrarius were inoculated with Hantaan virus, a brief viremia de- 
veloped in the animals at seven to ten days postinfection. Subsequently, viral 
antigen was detectable in many organs of the animals, including the kidney 
and lungs; virus was shed in their saliva, feces, and urine (57). Shedding of 
infectious virus persisted despite the presence of antibody capable of 
neutralizing Hantaan virus in in vitro assays (Figure 1). The persistent
infection had no apparent detrimental effects on the host, and the rodent became a
carrier of infectious virus, in a relationship similar to that previously de- 
scribed for arenavirus-rodent infections. Virus was persistently shed into the 
environment through contaminated urine and feces for periods of time that 
were probably equivalent to the natural lifespan of these rodents in the field. 
Investigators now suspect that excreted, infectious virus in urine and feces 
from infected rodents is the source of virtually all human infections. This also 
holds true for other hantaviruses discussed below (45, 93, 99). 

Figure 1 The course of infection and infectivity of Hantaan virus in Apodemus agrarius.
Shaded area of bars for lung, parotid glands, and kidneys indicates presence of antigen, but not
virus. Similar chronic infection and persistent shedding of virus is thought to occur in most
primary hosts of hantaviruses. (From Ref. 57. Reproduced with permission.)

The epidemiology of the hantaviruses and HFRS is intimately linked to the
biology of the rodent host. Natural population cycles of rodents and certain
human behaviors result in distinctive seasonal patterns of HFRS. In Asia, the 
disease is most common in late fall and early winter (51). These are the 
periods during which field mouse populations are maximal and the harvesting 
of crops places farmers in rodent-infested environments. The majority of 
HFRS cases are in the adult male population, as men are most likely to come 
into contact with rodents as part of their farming practices. 

What is now called “classic” HFRS due to Hantaan virus in Asia is 
characterized by a mild to severe disease, which follows an incubation period 
of two to three weeks, with a range of 5 to 42 days (75, 92). The disease 
progresses in relatively well-defined successive stages (in chronological 
order): febrile, hypotensive, oliguric, diuretic, and convalescent (8, 74, 75). 
Convalescence can take two to three months. The major clinical signs and 
symptoms seen in each stage are fever, shock, renal impairment, relative 
hypovolemia, and fluid and electrolyte imbalance. A petechial rash may also 
be present, and a characteristic facial flushing develops in many patients. 
Among the more severely ill patients, hemorrhagic signs, including scleral 
injection, ecchymosis, and gastrointestinal bleeding, may be seen (75). In 
fatal cases, death is usually a consequence of shock or renal failure and occurs 
most frequently during the oliguric stage of disease. Recovery from infection 
was believed to be complete (54, 75); however, reports of chronic renal 
impairment exist (15, 42, 44, 78), and recent observations from our own 
studies show an association between past infection with a hantavirus and
chronic renal disease (26). 

Treatment of HFRS has traditionally been limited to careful supportive 
care, with special attention to fluid balance augmented by renal dialysis in the 
most critically ill patients (59). Recently, a double-blind, placebo-controlled 
efficacy trial of the antiviral drug, ribavirin, was conducted in Wuhan, China 
(31). This 1986-1988 collaborative study spanned two
transmission seasons. Ten of the 108 patients receiving placebo died,
compared with only three of the 123 treated patients (31). This
significant difference in mortality indicated that ribavirin may be useful in the 
treatment of acute HFRS, especially if administered early in the course of 
disease. The drug also markedly reduced morbidity and frequently decreased 
the time patients spent in each stage of illness. 
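
The size of the mortality difference can be worked out from the four counts reported. The source does not say which test the investigators used; the one-sided Fisher exact test below is a stand-in chosen here because it needs nothing beyond those counts.

```python
from math import comb


def fisher_one_sided(deaths_a, n_a, deaths_b, n_b):
    """Probability of at least `deaths_a` deaths in group A by chance alone.

    Hypergeometric upper tail with all margins of the 2 x 2 table fixed.
    """
    total_deaths = deaths_a + deaths_b
    denominator = comb(n_a + n_b, total_deaths)
    upper = min(total_deaths, n_a)
    tail = sum(
        comb(n_a, k) * comb(n_b, total_deaths - k)
        for k in range(deaths_a, upper + 1)
    )
    return tail / denominator


# Counts as reported for the Wuhan trial (31).
PLACEBO_DEATHS, PLACEBO_N = 10, 108
RIBAVIRIN_DEATHS, RIBAVIRIN_N = 3, 123

placebo_mortality = PLACEBO_DEATHS / PLACEBO_N
treated_mortality = RIBAVIRIN_DEATHS / RIBAVIRIN_N
relative_risk = treated_mortality / placebo_mortality
p_value = fisher_one_sided(
    PLACEBO_DEATHS, PLACEBO_N, RIBAVIRIN_DEATHS, RIBAVIRIN_N
)
print(f"mortality: placebo {placebo_mortality:.1%}, ribavirin {treated_mortality:.1%}")
print(f"relative risk of death with ribavirin: {relative_risk:.2f}")
print(f"one-sided Fisher exact p-value: {p_value:.3f}")
```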


Severe HFRS of the Balkans 


Recently, an extremely severe type of HFRS was recognized in the mountain- 
ous regions of Greece (3, 4, 46), Albania, parts of Yugoslavia (5, 28), and 
Bulgaria. This disease is far more severe than nephropathia epidemica seen 
elsewhere in Europe and more closely resembles the Asian form of HFRS. 
Although relatively few cases have been recorded, the mortality rate appears 
to be even greater than in Asia; preliminary reports suggest that 15-30% of 
patients hospitalized with this disease may succumb to their infection. The 
causative agent, originally named Porogia virus, is closely related to Hantaan 
virus (3). It is also maintained by a small rodent, the yellow-necked mouse, 
Apodemus flavicollis (27). Unlike HFRS in Asia and other parts of Europe, 
cases in the Balkans peak during the warmer months of the year, as most cases 
occur around August. 

The origin of this severe form of HFRS is currently unknown, although the 
relatively wide geographic distribution of the virus throughout the Balkan 
Region, and its localization in sparsely populated rural habitats, suggests that 
human commerce was probably not involved in determining its present 
distribution. Nonetheless, serological analyses clearly indicate a very close 
relationship between isolates obtained from rodents and humans in the Bal- 
kans and Asian isolates of prototype Hantaan virus. Complete sequence 
information is not available for any Balkan isolate, but examination of small 
genomic sections of the M segment (365 base pairs), amplified by polymerase 
chain reaction (PCR) and digested by restriction endonucleases, suggests that 
at least the segment examined is highly conserved among both Balkan and 
Asian isolates (S.-Y. Xiao 1991, unpublished observations). 


Nephropathia Epidemica and Puumala Virus 


A less severe form of HFRS, with a case-mortality rate of less than 1%, is 
found in Scandinavia, western USSR, and other European countries (42, 43, 
88). Although nephropathia epidemica was first described in Scandinavia in
the 1930s, it is only now emerging as a relatively common cause of acute 
renal failure in France (35, 76), Italy (79), Belgium (96), and other European 
countries (36, 95, 97). As with Hantaan virus, the virus that causes nephropa- 
thia epidemica, Puumala virus, was only recently discovered and isolated. 

Puumala virus is maintained in nature by the bank vole, Clethrionomys 
glareolus (7), which is widely distributed in western Europe and western 
USSR; its range overlaps that of this milder form of HFRS (Figure 2). The 
course of viral infection in the vole host is similar to that described for 
Hantaan, with long-term, somewhat sporadic, shedding of virus in feces and 
urine (102). The disease in humans is highly seasonal; most cases occur in the 
late fall and early winter. A correlation between high rodent numbers and 
increased annual incidence of disease has been shown in Sweden for the years 
1961-1974 (70). In Sweden, for reasons which are unclear, nephropathia 
epidemica is largely limited to the northern half of the country: Infection in 
voles mirrors the distribution of human disease, although the same vole 
species occurs throughout the country (68). Again, adult men living in rural 
locations and engaged in outdoor employment, such as forestry workers, 
comprise the majority of cases (38). The vole host of viruses that cause 
nephropathia epidemica also moves into storage buildings and homes during 
cold weather; thus, indoor exposure can occur (70). 

Although the disease spectrum of nephropathia epidemica is similar to that 
seen for Far Eastern forms of HFRS, some of the clinical signs or symptoms 
are less prominent or occur in fewer patients (43). Renal dysfunction is still 
the prominent clinical characteristic, but anuria is rare. Concentrating capac- 
ity of the kidney may be impaired from weeks to months, but serious 
hemorrhagic manifestations and mortality are generally absent (43). 


Seoul Virus and Urban Rats 


Certainly, the most significant event in the recent history of HFRS research, 
from the global public health perspective, was the recognition of HFRS 
among urban residents of Seoul, Korea, and the subsequent isolation of Seoul 
virus from city rats (56). Seoul virus was first discovered by Lee and 
colleagues during their investigations of Hantaan virus (56). They were 
intrigued by cases of HFRS that occurred in urban residents with no history of 
rural travel and no exposure to A. agrarius. These patients suffered from a 
disease indistinguishable from the milder forms of HFRS due to Hantaan 
virus, and they developed antibodies reactive with Hantaan virus in the IFA 
test. When rodents were captured in and around these patients’ homes, no A. 
agrarius were trapped. However, both R. norvegicus and R. rattus, the black 
or roof rat, were present; when examined, they also possessed antibodies 
reactive by IFA tests with Hantaan virus. Later, through use of cross- 
neutralization tests, researchers discovered that the virus infecting rats and 
urban residents was antigenically distinct from Hantaan; this virus was named 
Seoul virus. 


Figure 2 Distribution of Apodemus agrarius and Clethrionomys glareolus, primary hosts of 
Hantaan and Puumala viruses, respectively. 


After the discovery of Seoul virus in urban rats in Korea, the possibility of 
potential international dissemination of this zoonosis via the shipping industry 
was suggested. With the exception of Antarctica, Norway and black rats are 
now widely distributed on all continents, where they have been introduced 
from Europe and Asia. Within a few years, infected Norway rats were identi- 
fied in the United States within the port cities of Philadelphia and Houston 
(49), New Orleans (94), and Baltimore (13). Viruses similar or identical to 
Seoul virus were isolated from rats captured in each of these locations, and the 
search for infected rats was expanded to a global scale. Serological surveys 
documented the presence of infected rats in many parts of Asia, Europe, 
Africa, and South America (48). Detailed surveys for human infection and 
disease are not available from most of these locations. 

Extensive investigations were initiated to study this Seoul-like virus in 
inner-city Baltimore rats. Infected rats were found widely distributed through- 
out the city (14), but were especially abundant in the lower-income neighbor- 
hoods, in which accumulated trash, litter, and garbage provided ideal con- 
ditions for rat infestations (13). Infections had been stably enzootic within the 
rats of Baltimore for at least ten years (J. E. Childs and G. E. Glass, 
unpublished data), based on long-term trapping in specific alleys. In addition, 
Norway rats do not move great distances in urban environments; therefore, 
widespread dissemination of a virus, without an arthropod vector, generally 
takes many years. Thus, Seoul-like viruses were not recent introductions to 
the US, but had been here for some time and had become widely dis- 
seminated. Rat populations of several areas were followed longitudinally over 
two years, and we were able to demonstrate that virus was transmitted among 
rats throughout the year. About 11% of the population became infected per 
month (11). As older cohorts of rats were examined, their antibody preva- 
lence rates increased until virtually all of the oldest rats had become infected 
(Figure 3). 
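
The link between a steady monthly infection rate and near-universal seropositivity in the oldest rats can be illustrated with a short calculation. The sketch below is illustrative only: it assumes that every rat faces the same, independent 11% risk in each month of life and that infected rats remain antibody-positive, simplifications that the field data do not establish. The function name and the ages shown are hypothetical.

```python
def cumulative_infection(monthly_risk: float, months: int) -> float:
    """Probability of at least one infection within `months` months,
    assuming a constant, independent monthly risk (a simplification)."""
    return 1.0 - (1.0 - monthly_risk) ** months


MONTHLY_RISK = 0.11  # about 11% of the population infected per month (from the text)

# Ages are hypothetical and chosen only to show the shape of the curve.
for age_months in (3, 6, 12, 24):
    share = cumulative_infection(MONTHLY_RISK, age_months)
    print(f"{age_months:2d} months: {share:.0%} expected seropositive")
```

Under these assumptions the expected share of seropositive animals rises steeply with age, which is at least consistent with the pattern described for older cohorts.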

The infection of laboratory rat colonies and tissue lines derived from rats 
also poses a public health threat to certain occupational groups. Outbreaks of 
laboratory rat-associated HFRS have been documented in Korea (58), Japan 
(30), and the Soviet Union (41); sporadic cases have occurred in Belgium (16) 
and the United Kingdom (65). Generally, these infections are limited to 
animal handlers or laboratory personnel. In some countries, infection rates of 
laboratory rat colonies can be high and widespread among research facilities. 
In the US, there is no evidence that commercial or laboratory colonies of rats 
are infected with Seoul-like viruses (45), and the widely practiced methods of 
barrier breeding and cesarean derivation may reduce the risk of introducing 
and maintaining infection within colonies (45). Routine screening of labora- 
tory animals for viral pathogens now usually includes serological tests for 
Hantaan (39). 

Cell lines derived from tumors of rats can be infected with a hantavirus, and 
were the source implicated in a recent human infection (66). Again, there is 
no evidence that commercially available cell lines are contaminated in the US 
(50). 


Figure 3 Relationship between antibody prevalence, as determined by IFA, and body mass for 
525 Norway rats captured in Baltimore, 1980-1985. Rats were separated by sex, then grouped 
into 100 g mass classes. (From Ref. 13. Reproduced with permission.) 


Prospect Hill Virus and Meadow Voles 


Prospect Hill virus was isolated from meadow voles (Microtus pennsylvani- 
cus) captured in Frederick, Maryland, in the early 1980s (63). This virus is 
antigenically distinct from other mouse, rat, or vole hantaviruses and 
co-circulates in space and time with Seoul-like viruses in small mammal com- 
munities in the US (37). Typically, the prevalence of Prospect Hill viral 
infection in voles is around 20-25% and increases in older cohorts of voles 
(12, 103). This virus is not currently considered a public health problem. 
Specific neutralizing antibodies to this hantavirus have been detected in sera 
collected from professional mammalogists (104), who constitute a high-risk 
group for exposure to the reservoir. The infected individuals had no recollec- 
tion of an illness compatible with HFRS. 


Leakey Virus and House Mice 


One of the most recently proposed additions to the Hantavirus genus is a virus 
isolated from the common house mouse (Mus musculus) in Texas (6). This 
virus can be differentiated from other hantaviruses by serological techniques. 
The public health relevance of this agent is unresolved; however, Baek et al 
(6) have reported human disease associated with serological evidence of 
Leakey virus infection. The geographic distribution, host specificity, and 
prevalence of this virus have not been adequately established for any region. 
We speculate that some of the hantaviral antibody found in house mice in 
recent surveys by IFA, and shown to be non-neutralizing for Hantaan and 
Seoul viruses, may be directed against Leakey virus. 


MOLECULAR BIOLOGY, DIAGNOSTIC TESTS, AND 
VACCINE DEVELOPMENT 


Alongside progress in our understanding of the epidemiology 
of the hantaviruses has come detailed molecular characterization of this new 
group of viruses. Some of the molecular techniques have provided powerful 
new tools to aid in the diagnosis (86, 98) and epidemiological study of HFRS 
and promise to provide novel vaccine candidates. 

Hantaan and related viruses were placed in the family Bunyaviridae on the 
basis of physical characteristics and genetic determinations (32, 33, 82-84). 
These viruses show the characteristic tripartite RNA genome of other 
bunyaviruses, but were placed in a new genus by virtue of their unique 3' 
terminal sequences (83). The three segments, designated S, M, and L for 
small, medium, and large, code for nucleocapsid protein (85, 89), envelope 
glycoproteins (24, 72), and a presumed transcriptase (80), respectively. 
Various strains of hantaviruses have been cloned and sequenced (1, 2, 24, 34, 
71, 72, 83-85, 87, 89), and both the surface glycoproteins and the core 
nucleocapsid protein have been expressed in different systems (81). These 
expressed proteins may be produced in large, relatively pure quantities and 
may eventually replace cell culture derived antigens for use in diagnostic tests 
for HFRS (77, 81). 

Currently, definitive diagnosis of HFRS is relatively slow, as it depends on 
clinical presentation, coupled with appropriate serological tests and, in rare 
cases, virus isolation. Standard serological tests to confirm a diagnosis of 
HFRS are mostly based on immunofluorescent or enzyme-linked im- 
munosorbent assays to detect immunoglobulin G (29, 64, 67, 69, 90). 
However, for epidemiological monitoring in regions where more than one 
hantavirus circulates, positive identification of the infecting agent may de- 
pend on cumbersome cross-plaque-reduction neutralization assays of serum 
performed on batteries of hantaviruses under laboratory containment con- 
ditions (91). 

Serological tests for immunoglobulin M (IgM) antibody specific for hanta- 
viruses are useful aids in diagnosing acute disease (47). Immunoglobulin M 
antibody is present very early during the course of the disease. In a study in 
Wuhan, China, the majority of patients had IgM on admission to the hospital 
(10). Based on limited study, other hantaviral infections show a similar 
pattern; thus, measuring specific anti-hantaviral IgM antibodies is the method 
of choice for diagnosis of acute HFRS. 

Virus isolation is a difficult task with hantaviral infections. The advent of 
PCR technology, and its adaptation to RNA viruses, may make it possible to 
diagnose and identify the type of hantavirus within a single day (98). Both M 
segment and S segment PCR tests are currently being investigated, and 
universal primers that bind to and amplify more than 20 strains of hantavirus 
have been identified. After amplification, restriction enzymes can be em- 
ployed to differentiate isolates rapidly. Such analyses have shown tremendous 
power to group hantaviruses, and these methods hold great potential for future 
molecular epidemiological studies. 
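
The idea of typing amplified segments by their restriction patterns can be sketched in a few lines. The example below is a toy illustration, not the published protocol: the two amplicon sequences are invented, the function name is hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used only because its site is well known, not because the cited studies used it.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting `sequence` at every
    occurrence of `site`, `cut_offset` bases into the site."""
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Invented amplicons (hypothetical): isolate B differs from A by one base
# inside the second recognition site, which removes that cut.
isolate_a = "ATGCC" + "GAATTC" + "TTGACCGTAG" + "GAATTC" + "AAGT"
isolate_b = "ATGCC" + "GAATTC" + "TTGACCGTAG" + "GAGTTC" + "AAGT"

pattern_a = digest(isolate_a, "GAATTC", 1)
pattern_b = digest(isolate_b, "GAATTC", 1)
print("isolate A fragments:", pattern_a)
print("isolate B fragments:", pattern_b)
print("same pattern:", pattern_a == pattern_b)
```

A single base change within a recognition site alters the fragment pattern, which is why a digest of a short amplified segment can separate closely related isolates without full sequencing.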

In addition to cloning sequences of hantaviral genomes for diagnostic 
purposes, researchers at the United States Army Medical Research Institute of 
Infectious Diseases have constructed an engineered vaccine for Hantaan virus 
by using vaccinia virus as a vector. Phase I testing of candidate vaccine 
constructs is planned for the near future. Currently, suckling mouse and 
suckling rat brain vaccines are available, or being tested, in North and South 
Korea (55); other inactivated vaccines are being developed (100). 


HFRS IN THE UNITED STATES 


Shortly after the recognition that hantaviral infections occurred in rats and 
other rodents in the US, seroepidemiological surveys were undertaken to 
determine the extent of human exposure and to search for evidence of human 
disease. Early studies showed that human infection occurred among shipyard 
workers (94), mammalogists (104), dialysis patients, and laboratory workers 
(23), as well as other populations. However, as no acute illness consistent 
with the then known presentation of HFRS was observed, investigators 
suggested that HFRS in the US may be a mild or atypical disease, or that the 
viruses present on this continent may be nonpathogenic (101). A recent 
survey of forestry workers and others with outdoor occupations suggests that 
infection from Prospect Hill-like viruses is rare (19). 

However, the abundance of hantaviral infections among inner-city rats, 
their coexistence with the resident human population, and the recognition in 
Asia that this virus causes acute illness led us to search for human disease 
among the inner-city residents of Baltimore. Studies conducted in the inter- 
vening years since the initial identification of hantaviruses in the US had 
shown that HFRS due to rat-borne viruses in Asia caused an illness that was 
less severe in its hemorrhagic and renal manifestations than that caused by 
Hantaan virus. The disease associated with rat-borne viruses also showed a 
higher degree of hepatic involvement (9, 53). 

We first conducted a serosurvey of more than 2000 persons who visited a 
venereal disease clinic, which was located in inner-city Baltimore. The 
population sampled was predominantly composed of black men in their 
mid-20s who were of lower socio-economic status. Several individuals in this 
group were found with antibody specific for the Baltimore strain of Seoul 
virus, which yielded an antibody prevalence rate of 2.4 per thousand. All of 
these individuals were born in Baltimore and resided in areas known to have 
infected rats; none had traveled outside of the US. These results indicated a 
low rate of indigenous exposure to hantaviruses within the city. 

We next examined sera from more than 4500 patients seen at the Johns 
Hopkins Hospital. Patients with elevated proteinuria were selected for ex- 
amination, as this has been a consistent laboratory finding in all forms of 
HFRS, regardless of the infecting virus (42). This population was also drawn 
primarily (> 75%) from inner-city Baltimore, where infected rats were 
common. This group was primarily composed of black women in their 
mid-40s who were also from lower socio-economic neighborhoods. The 
antibody prevalence rate at 12 per thousand was fivefold higher than that seen 
at the venereal disease clinic. As a control population, individuals using the 
Johns Hopkins Hospital emergency room were examined for exposure to 
hantaviruses. Age-corrected seroprevalence in the emergency room popula- 
tion did not differ significantly from the sexually transmitted disease clinic, 
but was 1.5-3.2 times lower than in the proteinuria group. 
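
The comparison of antibody prevalence between groups reduces to simple ratios. The sketch below uses only the two rates quoted in the text (2.4 and 12 per thousand); the dictionary keys and function names are hypothetical, and because the source does not give the exact numbers of seropositive persons, no counts are assumed.

```python
# Rates per thousand as quoted in the text; labels are hypothetical.
rates_per_thousand = {
    "clinic": 2.4,        # inner-city clinic serosurvey
    "proteinuria": 12.0,  # hospital patients selected for proteinuria
}


def prevalence_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two prevalence rates expressed in the same units."""
    return rate_a / rate_b


def as_percent(rate_per_thousand: float) -> float:
    """Convert a rate per thousand to a percentage."""
    return rate_per_thousand / 10.0


ratio = prevalence_ratio(
    rates_per_thousand["proteinuria"], rates_per_thousand["clinic"]
)
print(f"proteinuria group vs clinic: {ratio:.1f}-fold")
for group, rate in rates_per_thousand.items():
    print(f"{group}: {as_percent(rate):.2f}%")
```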

Most of the seropositive patients did not have IgM antibodies, and sequen- 
tial samples failed to show rising titers, thus suggesting that these were 
previous infections. However, five patients showed either seroconversion, 
rising neutralizing titers, or elevated IgM titers and neutralizing titers, which 
indicated recent exposure. These patients consistently had nausea, vomiting, 
epigastric pain, and low-grade fever. Laboratory findings indicated renal and 
liver involvement, as measured by elevated BUN, serum creatinine, AST, 
ALT, total bilirubin, and LDH (G. E. Glass, A. J. Watson, and J. E. Childs 
1991, unpublished information). Thrombocytopenia occurred in some 
patients, and pleural effusion was also observed. Hemorrhagic manifestations 
were rare and mild, although blood in the sputum and melena were reported. 
Thus, illness associated with changes in hantaviral antibody status was similar 
to reports of mild HFRS due to rat-borne hantaviruses from the Far East and 
Europe (9). This suggests that HFRS occurs in US populations exposed to 
infected rats, but the relative rarity of severe, acute illness may preclude its 
recognition. Among the Johns Hopkins group, the acute illness resolved 
spontaneously with supportive care. However, at least one patient continued 
to show evidence of chronic renal insufficiency 13 months after the illness, 
and the potential for sequelae from HFRS infection was of concern. 

Many of the Johns Hopkins patients with antibodies to a hantavirus suffered 
some form of chronic disease. We matched each seropositive person (N = 15) 
by age and sex to five seronegative controls from the same patient population, 
and found that the seropositive group was significantly more likely to suffer 
from a specific form of chronic renal disease, hypertension, or a history of 
stroke than were seronegative individuals with proteinuria (26). The associa- 
tion was specific for those conditions that could conceivably be linked to past 
kidney disease, whereas the rates of other chronic illnesses, such as diabetes, 
did not differ significantly. The differences could not be explained on the 
basis of race, residence, occupation, age, or sex. 

Patients’ charts were reviewed by a nephrologist, who lacked prior knowl- 
edge of the patients’ serological status, for the primary diagnosis underlying 
their renal disease. Hypertensive nephrosclerosis was the most common 
diagnosis among the seropositive group (70%), and these findings differed 
significantly from those of the matched seronegative controls (9%), among 
whom diabetes mellitus was the most common cause of renal disease (50%). 
Other factors, such as drug abuse, polycystic disease, and glomerulonephritis, 
were secondary causes of renal dysfunction among the seronegative control 
group. 

Finally, we examined more than 400 patients enrolled in a chronic renal 
dialysis program in Baltimore. The population age and sex distribution was 
similar to that of the Johns Hopkins Hospital sample. The rate of seropositiv- 
ity for this group was 20 per 1000, the highest of any group sampled. 

Prompted by the discovery of an apparent association between seroposi- 
tivity to hantaviruses and some forms of chronic disease, we encouraged our 
collaborators in Europe to consider the possibility of chronic renal damage 
among the patients they had diagnosed. Most initially felt that there were no 
significant sequelae after HFRS. When they specifically looked, however, 
they found that about 10% of their hospitalized patients left with some 
evidence of persistent renal dysfunction (A. Antoniadis and J. W. LeDuc 
1990, unpublished observation). Few of these patients have been systemati- 
cally followed over time, and those who have were followed for only a few 
years. Even so, several continue to have a demonstrable inability to con- 
centrate urine, other indications of possible permanent kidney damage, or 
essential hypertension. We are actively investigating this facet of the disease 
to determine how frequently chronic disease follows HFRS. 

These studies indicate that HFRS does occur within the US and produces an 
illness similar to that seen in other parts of the world where disease is due to 
rat-borne hantaviruses. They also suggest that infection may produce serious 
sequelae among populations that are exposed to rats in this country. 

The possibility of long-term renal dysfunction after a hantaviral infection 
has not been well studied. One of the few studies examined Korean Conflict 
veterans who had suffered hantaviral infections and a group of matched 
controls in 1956, about three to five years after most cases would have been 
infected (78). This study found a significant increase in the rate of genitouri- 
nary hospital admissions among the HFRS cases, which increased with the 
severity of their original disease. Other findings included hyposthenuria, 
persistent mild albuminuria, and hypertension. Similarly, Lahdevirta (42, 44) 
examined patients with nephropathia epidemica 1-6.5 years after disease and 
found evidence of depressed renal tubular function and hypertension in some 
of these patients. Hypertension was especially common in his follow-up 
population; nearly 75% of the individuals were hypertensive. These in- 
dications of chronic renal dysfunction and hypertension are consistent with 
the hypothesis that their condition could evolve over time to conditions similar 
to those seen among the Baltimore residents. 

A planned experiment is to reexamine the Korean Conflict veterans now, 
nearly 40 years after their initial disease, to determine their present level of 
kidney function. The staff of the Medical Follow Up Agency, in collaboration 
with our laboratory, has initiated plans to conduct such a study, and those 
results may be available in the near future. Fortunately, the records from the 
original study by Rubini et al (78) are still intact at the Follow Up Agency, 
and we should be able to make some very interesting comparisons. 

Providing the proof that infection with a hantavirus may predispose an 
individual to hypertension and chronic renal sequelae is of great relevance. In 
excess of $50 billion a year is spent in the US for medical care for kidney and 
urologic disease. About $3 billion of that total is for federal Medicare 
payments for dialysis and transplantation for persons with end-stage kidney 
disease (40). This total is growing, while the funds available for medical care, 
in general, are shrinking. If even a small portion of this burden is a conse- 
quence of past hantaviral infections, for example the 2% that we found in our 
Baltimore dialysis units, then we as a nation are spending about $100 million 
a year on a condition with opportunities for prevention, control, and clinical 
intervention. Clearly, this emerging disease requires our immediate attention 
and action. 


CONCLUDING COMMENTS 


The past decade has witnessed a tremendous growth in our knowledge 
regarding the epidemiology, molecular biology, and diseases caused by the 
hantaviruses. We now know that several different viruses are capable of 
causing clinically similar diseases. These viruses are maintained in nature 
primarily within rodent reservoirs, in which they cause chronic infections 
with persistent viral shedding in secretions or excretions. The hantaviruses are 
distributed far more widely than once suspected. The diagnosis of acute 
HFRS can now be made rapidly and accurately, and ribavirin, an antiviral 
drug, appears to be efficacious in its treatment, if given early in the course of 
disease. The molecular characterization of these viruses has progressed rapid- 
ly and holds the promise of novel vaccine design and manufacture in the near 
future. Finally, preliminary evidence suggests that past hantaviral infection 
may be associated with subsequent development of chronic renal disease, a 
phenomenon that may have considerable domestic and global public health 
and economic implications. 
INTRODUCTION 


Physical Activity from Prehistory to the Present 


Remains of our early, human-like ancestors, Australopithecus afarensis, have 
been dated as 3.5-3.8 million years old. Nearly 4 million years of evolution 
of the human family, Hominidae, produced modern humans, H. sapiens, by 
approximately 35,000 years ago (71). The earliest hominids were scavengers; 
but, by about 1 million years ago, hunting and gathering was firmly es- 
tablished as a way of life for human beings. A hunting and gathering lifestyle 
involves high energy expenditure for several days a week, with peak bouts of 
strenuous physical activity (26, 93). 

The next major change in human sociocultural development was the 
domestication of plants and animals and the rise of agriculture, which oc- 
curred only 10,000 years ago. Advances in industrialization over the past 200 
years led to further urbanization and the rise of the middle class. But, even 
during this period, most individuals had relatively high energy expenditures 
compared with those of society at the end of the twentieth century. 

Human energy expenditure requirements have declined over the twentieth 
century, a trend that has apparently accelerated during the technological era 
following World War II (92). Increased automotive transportation, wide- 
spread adoption of sedentary activities, and labor-saving devices are major 
contributors to the decline in energy expenditure by individuals. The metabol- 
ic energy demands of previously strenuous jobs, such as working as a 
longshoreman or coal miner, are much lower today than in the past because of 
containerization, mechanization, and automation. 

Humans evolved to be active animals and may not be able to adapt well to 
the modern sedentary lifestyle. This point is well stated by Eaton et al (27): 
“From a genetic standpoint, humans living today are Stone Age hunter- 
gatherers displaced through time to a world that differs from that for which 
our genetic constitution was selected.” This teleological argument of human 
genetic selection and the need for physical activity does not prove that activity 
is necessary for health, but it may serve as a useful launching point for the 
review and discussion that follows. 


Development of Exercise Science 


The scientific study of exercise is a recent development (62). Physiologists in 
the latter part of the nineteenth century began to use exercise to perturb body 
systems to understand physiological functioning better. Indeed, three exercise 
physiologists, Meyerhof (muscle metabolism) and Krogh and Hill (physiolo- 
gy of exercise), have been awarded the Nobel prize for their research (74). 

Over the past 70 years, hundreds of studies have documented the type and 
extent of changes with physical training that occur in skeletal muscle, the 
circulatory system, pulmonary function, the heart and vascular system, and 
endocrine function. These studies have been done in the young and the 
elderly, in men and women, with different training protocols, and under 
varying environmental conditions. The earlier studies typically had small 
samples, frequently lacked control groups, were short-term, and had other 
design flaws. These shortcomings have been overcome in studies over the 
past 10-20 years. 

Systematic studies on the health effects of physical activity are more recent, 
primarily confined to the past 30-40 years. Morris et al (75-77) are generally 
credited with a leading role in formulating the modern physical activity-coronary 
heart disease hypothesis with their studies on London transport 
workers and, later, on British civil servants. 


Definitions 


Several key terms, central to the purpose of this chapter, need to be defined. 
We adopt the definitions of Caspersen et al (17) for physical activity, ex- 
ercise, and physical fitness: 


1. Physical activity: Any bodily movement produced by skeletal muscles that 
results in energy expenditure. 
2. Exercise: Planned, structured, and repetitive bodily movement done to 
improve or maintain one or more components of physical fitness. 

3. Physical fitness: A set of attributes that people have or achieve that relates 
to the ability to perform physical activity. 


The physical fitness component that has been most frequently studied for an 
association to health is aerobic power or, as it is measured in the physiology 
laboratory, maximal oxygen uptake. This attribute is also called cardiovascu- 
lar, cardiorespiratory, or endurance fitness. Unless otherwise specified, we 
use the term physical fitness to refer to aerobic power. 

The other major term that needs to be defined is health. In this chapter, we 
take a broad view of health, one that not only includes freedom from disease, 
but also the ability to achieve activities of daily living. Disease endpoints are 
frequently used in studies of physical activity. For our purposes, however, the 
definition of health goes beyond freedom from clinical disease to include a 
focus on functional capability or functional health status. This latter 
characteristic includes avoidance of functional disability, but also extends to 
higher levels of functional capability. One of the best-documented 
effects of regular physical activity is a higher level of physical fitness. This 
permits a higher level of functional ability to participate in a wide array of 
life’s activities with ease and enjoyment. The active and fit person is not 
likely to become fatigued by the routine activities of daily living and has a 
greater capacity to meet emergencies or participate in vigorous recreational 
activities. 

Purpose of this Chapter 

This chapter reviews existing clinical exercise studies and population-based 
investigations of physical activity and physical health. We concentrate on 
potential preventive etiologic associations, with little emphasis on therapeutic 
effects of physical activity on health and disease. We integrate the findings 
from these two research fronts, point out agreements and disagreements, and 
summarize the results to assess how much physical activity is required for 
health. The descriptive epidemiology of physical activity in the United States 
and the public health burden of a sedentary lifestyle are discussed, and public 
health recommendations for physical activity and physical fitness are pre- 
sented. 


CLINICAL EXERCISE STUDIES 


Exercise and Physical Fitness 

Exercise-trained individuals have higher levels of physical fitness, and the 
relation between activity and fitness was probably known in antiquity. 
Athletes and soldiers have long been trained to improve their capacity for 
performance. Carefully done studies to quantify the training required to 
produce an improvement in fitness are a recent phenomenon; in 1957, Kar- 
vonen et al (55) published one of the first of these studies. Dozens of studies 
over the past 35 years focused attention on three principles of exercise 
prescription: intensity, frequency, and duration (5). 


INTENSITY For the past several decades, the generally held view has been that 
there is a minimum exercise intensity required to stimulate an improvement in 
physical fitness. The American College of Sports Medicine (ACSM) was the 
first scientific organization to publish official statements on exercise prescrip- 
tion. Their 1975 textbook set 70% of maximal oxygen uptake as the minimum 
recommended exercise intensity for improving physical fitness (4). Subse- 
quent studies lowered recommendations for the intensity threshold, and the 
third edition of the ACSM book in 1986 (3) recommended a minimum 
exercise intensity of 50%. The 1991 fourth edition (2) recommends moderate 
exercise, defined as exercise between 40-60% of maximal capacity, as 
appropriate for many persons. A 1990 ACSM position stand states that 
“persons with a low fitness level can achieve a significant training effect with 
. . . 40-50%” of capacity (5). An alternate hypothesis to a threshold level of 
intensity is that the response to exercise training is primarily, if not ex- 
clusively, dependent upon the total energy expended in exercise and not 
intensity. This distinction is important and needs additional clarification. If a 
minimum intensity threshold exists, it probably varies depending upon the 
initial fitness level of the participant, the duration of the exercise session, the 
length of the training period, and perhaps other individual characteristics of 
the person undergoing training. 


DURATION The ACSM recommends 20-60 minutes of continuous aerobic 
activity for each exercise session (2, 5). There is an interrelationship between 
intensity and duration in their impact on fitness change. Low intensity activity 
must be sustained longer than high intensity activity to have the same effect 
on improvement in aerobic power. Again, the total energy expenditure of the 
exercise session is likely the critical determining factor for fitness change. 
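
The trade-off between intensity and duration can be illustrated with a rough calculation. The sketch below assumes the common approximation that 1 MET corresponds to about 1 kcal per kilogram of body weight per hour. The MET values, the 70 kg body weight, and the function names are illustrative assumptions and are not taken from the studies cited.

```python
def session_energy_kcal(mets, minutes, body_weight_kg):
    """Approximate gross energy cost of one session.

    Assumes 1 MET ~ 1 kcal per kg of body weight per hour (an approximation).
    """
    return mets * body_weight_kg * (minutes / 60.0)


def minutes_to_match(target_kcal, mets, body_weight_kg):
    """Minutes needed at a given intensity to expend target_kcal."""
    return target_kcal * 60.0 / (mets * body_weight_kg)


weight_kg = 70.0  # hypothetical participant

# Hypothetical high-intensity session: 8 METs for 20 minutes.
high_kcal = session_energy_kcal(8.0, 20, weight_kg)

# Duration needed at a hypothetical lower intensity (4 METs) for the same total.
low_minutes = minutes_to_match(high_kcal, 4.0, weight_kg)

print(f"High-intensity session: {high_kcal:.0f} kcal")
print(f"Minutes at 4 METs for the same energy: {low_minutes:.0f}")
```

Under these assumptions, halving the intensity doubles the time needed to reach the same total energy expenditure, which is the sense in which low intensity activity must be sustained longer.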

Investigators have challenged the belief that continuous aerobic activity is 
necessary to achieve a training effect. A recent study addresses the issue by 
comparing two different training regimens (21). One group trained five days 
per week with one 30-minute session per day. A second group trained five 
days per week with three 10-minute sessions per day. Improvements in 
physical fitness after eight weeks of training were similar, thus suggesting that 
the accumulation of activity over the course of the day can produce a training 
effect. 
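
The two regimens in this comparison supply the same weekly volume of exercise, which a short calculation makes explicit. The function name below is our own; the session counts and durations are those described above.

```python
def weekly_minutes(days_per_week, sessions_per_day, minutes_per_session):
    """Total minutes of exercise accumulated in one week."""
    return days_per_week * sessions_per_day * minutes_per_session


continuous = weekly_minutes(5, 1, 30)   # one 30-minute session per day
accumulated = weekly_minutes(5, 3, 10)  # three 10-minute sessions per day

print(continuous, accumulated)
```

Because the totals are equal, the comparison isolates the effect of splitting the daily dose rather than the effect of the dose itself.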


FREQUENCY The ACSM recommends participation in exercise training 
three to five days per week (2, 5). Most studies show little change in physical 
fitness if exercise is done less than three days per week, unless the exercise is
quite strenuous. Exercising more than five days per week does not result
in greater improvement in fitness than training five days per week (5).


Physiological Effects of Acute and Chronic Exercise 


The potential beneficial effects of acute and chronic exercise on physical 
fitness and health have been intensely investigated in recent years. Existing 
laboratory and clinical studies have documented a broad array of physiologic 
benefits, including metabolic, hormonal, and cardiovascular adjustments that 
are evident at rest, as well as during and following both maximal and 
submaximal exertion (14). Acute and chronic exercise also reduce anxiety and 
depression and positively impact other psychological characteristics of both 
normal persons and those with clinical disorders (99). In this section, we 
focus only on those key physiological benefits that have been hypothesized to 
contribute toward a reduced risk for mortality, especially from cardiovascular 
disease and cancer. 


IMPROVEMENT OF BALANCE BETWEEN MYOCARDIAL OXYGEN DEMAND 
AND SUPPLY The myocardial oxygen requirement during exercise is de- 
termined by a variety of factors, the most important of which are reflected by 
the rate-pressure product (that is, the product of the heart rate and systolic 
blood pressure) (2). Because the rate-pressure product increases linearly 
during graded exercise, so too does the myocardial oxygen demand. Follow- 
ing exercise training, the rate-pressure product elicited by a given submaximal 
exercise intensity is usually substantially attenuated (117). This enables a 
specific physical activity to be performed with a lessened myocardial oxygen 
demand and, therefore, a reduced risk for myocardial ischemia. 
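
The rate-pressure product defined above is a simple multiplication, and the effect of training on it can be sketched numerically. The heart rates and systolic pressures below are hypothetical values chosen only for illustration; they are not data from the studies cited.

```python
def rate_pressure_product(heart_rate_bpm, systolic_bp_mmhg):
    """Rate-pressure product: heart rate times systolic blood pressure."""
    return heart_rate_bpm * systolic_bp_mmhg


# Hypothetical responses to the same submaximal workload.
before_training = rate_pressure_product(140, 170)
after_training = rate_pressure_product(120, 160)

reduction_pct = 100.0 * (before_training - after_training) / before_training

print(f"Before training: {before_training}")
print(f"After training:  {after_training}")
print(f"Reduction: {reduction_pct:.1f}%")
```

In this illustration, modest decreases in both heart rate and systolic pressure at a fixed workload combine into a larger proportional decrease in the index of myocardial oxygen demand.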

Currently, there is no direct evidence that exercise conditioning induces the 
formation of coronary collaterals in humans, and this issue will probably not 
be resolved until more sophisticated techniques for assessing coronary col- 
lateralization are developed and utilized in clinical exercise training studies 
(56). However, there is now preliminary evidence that exercise training may 
indeed enhance myocardial oxygen delivery and/or utilization (29, 56). 


ECCENTRIC VENTRICULAR HYPERTROPHY Myocardial hypertrophy is an 
adaptive mechanism that develops in response to increased hemodynamic 
loading of the heart. Depending on the specific nature of the hemodynamic 
loading, the resultant increase in cardiac mass is associated with characteristic 
alterations in the volume of the cardiac cavities and in the thickness of their 
walls. As an adaptive response to volume overloading of the left ventricle, 
dynamic exercise training often produces an increase in left ventricular wall 
thickness and, to a greater degree, chamber size. This so-called eccentric 
hypertrophy is believed to be associated with an increase in myocyte vascularity
that is commensurate with the degree of hypertrophy of the myocytes
themselves and thereby improves myocardial function and assures myocyte 
health (123). 

Left ventricular function is a principal determinant of the risk of mortality 
following an acute myocardial infarction. Because persons with eccentric 
hypertrophy could suffer relatively less impairment in left ventricular function 
for a given amount of myocardial damage, Ekelund et al (31) have hypothe- 
sized that they may be better able to survive an acute myocardial infarction. 


REDUCED RISK FOR LETHAL VENTRICULAR ARRHYTHMIAS Noakes et al
(81) have shown that the exercise-trained rat heart has a reduced propensity 
for ventricular fibrillation during normoxia, hypoxia, and acute regional 
myocardial ischemia. They have further demonstrated that exercise training 
increases the ventricular fibrillation threshold of the previously infarcted 
isolated rat heart before and after the onset of reinfarction (96). These findings 
imply that regular exercise, before or after an acute myocardial infarction, 
may act directly on the myocardium to enhance its resistance to lethal 
ventricular arrhythmias. Although human studies are needed to substantiate 
this hypothesis, it is compatible with the finding of metaanalyses, which 
demonstrate that cardiac rehabilitation protects against mortality (which is 
mostly related to lethal ventricular arrhythmias) rather than reinfarction (82,
83), and epidemiologic studies, which link a physically active lifestyle with a
reduced risk for sudden cardiac death (76, 84). 


FAVORABLE EFFECT ON BLOOD COAGULABILITY Total occlusion of a 
coronary artery as a result of thrombus formation at the site of an atherosclerotic
stenosis is believed to be the final precipitating event in more than 90%
of acute myocardial infarctions. Although conflicting findings have been 
reported and additional research is still needed, exercise training is thought to 
reduce the adhesiveness and aggregability of blood platelets (30, 101). 
Moreover, whereas physical inactivity appears to decrease fibrinolysis, ex- 
ercise training tends to moderately augment it (30), which would improve the 
body’s ability to dissolve thrombi if they form. 


IMPROVED PLASMA LIPIDS AND LIPOPROTEINS Table 1 presents a summary
of the effect of acute and chronic exercise on plasma lipids and
lipoproteins. Of these benefits, perhaps the most relevant is the increase in 
high density lipoprotein (HDL)-cholesterol. Generally, a single bout of mod- 
erate-to-long duration aerobic exercise evokes a 4-6 mg/dl increase in the 
HDL-cholesterol levels of men and women (41). Recent studies by Hughes et 
al (51, 52) further suggest that although exercise intensity does not appear to 
be a significant modifier of the acute impact of aerobic exercise on HDL- 
cholesterol levels in men, exercise duration does. In their study, the increase 
in serum HDL-cholesterol levels at 24 hours after a bout of exercise performed
at an oxygen uptake of 20% below the anaerobic threshold was greater
when the exercise duration was 45 minutes, as compared with 30 minutes
(52).


Table 1 Results of studies investigating the relationship between aerobic exercise training and lipoprotein levels (a, b)

Columns: Studies of acute exercise; Cross-sectional studies; Longitudinal studies
Rows: Total cholesterol; VLDL; LDL; HDL; Total cholesterol/HDL
[Cell entries (direction-of-change symbols) are not recoverable from this copy.]

a ↓ = generally, a decrease has been found; ↑ = generally, an increase has been found; → ↓ = generally, no change or a decrease has been found.
b Reproduced with permission from Ref. 41


Likewise, although not all studies are in agreement, results generally show
a 5-15% increase in plasma HDL-cholesterol levels following chronic exercise
training (41). In men, such increases appear to be directly related to
both the intensity of exercise and total quantity of weekly energy expenditure 
(126). In women, recent research conducted at the Institute for Aerobics 
Research suggests that moderate intensity exercise training performed at 
approximately 55% of the maximal heart rate may be as effective in increas- 
ing HDL-cholesterol levels as higher intensity exercise training (25). 


REDUCED RISK FOR HYPERTENSION AND LOWERING OF HIGH BLOOD 
PRESSURE Epidemiologic studies have documented a reduced risk for the 
development of hypertension in physically active persons (42). Several stud- 
ies have also demonstrated that the blood pressures of hypertensive patients 
are reduced for one to three hours following a single 30-45 minute bout of 
aerobic exercise (42). Moreover, a recent metaanalysis of 25 longitudinal 
studies has confirmed the efficacy of aerobic exercise training in lowering 
elevated systolic and diastolic blood pressures (43). The average sample-size- 
weighted reductions in resting systolic and diastolic blood pressures in this 
metaanalysis were 10.8 and 8.2 mmHg, respectively. Interestingly, in the 
studies included in the metaanalysis, moderate-intensity exercise appeared to
be just as effective as, if not more effective than, higher-intensity exercise.


ENHANCED INSULIN SENSITIVITY Findings from the Framingham study 
indicate that the incidence of cardiovascular disease among individuals with 
diabetes mellitus is approximately two to three times higher than that in
normoglycemic individuals (53). Recent research has further shown that 
insulin enhances the proliferation of arterial smooth-muscle cells and stimu- 
lates lipogenesis in arterial tissue (34). Not surprisingly, hyperinsulinemia has 
also been linked to an accentuated risk for acute myocardial infarction, even 
in nondiabetic men (24). 

Acutely, a single bout of submaximal aerobic exercise enhances insulin 
sensitivity in skeletal muscle and other tissues. Therefore, such exercise often 
results in a decline in the blood glucose levels of patients with insulin-
dependent or noninsulin-dependent diabetes mellitus (122). This exercise-
induced improvement in glucose metabolism may persist from hours to days
and is thought to be modulated by an increase in the cell membrane glucose 
transporter number, as well as an increase in the intrinsic activity of these 
transporters (59). 

With chronic exercise training, glycemic control also improves in persons 
with noninsulin-dependent and, to a lesser degree, insulin-dependent diabetes 
(122). However, as is partly the case with plasma lipoproteins and blood 
pressure, it is unclear whether such improvements are largely due to the 
cumulative effects of the individual acute bouts of exercise, rather than a 
training-mediated change in fitness per se (122). 


REDUCTION OF OBESITY AND IMPROVEMENT IN BODY FAT DISTRIBU- 
TION Caloric restriction through dieting, in combination with caloric ex- 
penditure through regular exercise, appears to be the most effective means of 
preventing obesity and maintaining an ideal body weight. This approach, as 
compared with dieting alone, better preserves lean body mass and may 
possibly be linked to favorable chronic changes in resting metabolic rate (35, 
94, 120). Regular exercise may also be associated with benefits in terms of 
both maintenance and stability of weight loss (57). 

Recent studies have shown that many of the adverse consequences of 
obesity may be more closely coupled to the distribution of body fat than to the 
amount of body fat (8). Indeed, individuals with more fat on the trunk, 
especially intraabdominal fat, are at increased risk of death when compared 
with individuals who are equally fat, but whose fat is predominantly on the 
extremities (8). Although additional studies are needed, regular exercise 
appears capable of evoking favorable changes in body fat distribution (23). 
Indeed, preliminary exercise-training studies suggest a preferential mobilization
of trunk subcutaneous fat as compared with peripheral subcutaneous fat
(23).


ENHANCEMENT OF IMMUNOLOGIC FUNCTION In view of existing evidence
that physical activity decreases the risks of colon cancer (especially in men) 
and breast and reproductive cancer in women, together with the recognized 
importance of the immune system in the body’s defense against neoplasia, it
is understandable why the immunology of exercise is currently an active area 
of research (15, 108). Although both acute and chronic exercise have been 
associated with potentially beneficial immunologic consequences, the hypoth- 
esis that an exercise-induced enhancement of immunosurveillance contributes 
to a decreased cancer risk is currently controversial and in need of consider- 
able future research. Indeed, many experts now believe that the mechanism 
by which regular physical activity may protect against certain types of cancer 
is nonimmunologic in nature (15, 108). Such nonimmunologic mechanisms 
are thought to include a reduction in intestinal transit time, in the case of colon 
cancer (61), and hormonal alterations (for example, decreased estrogen levels 
and consequently less end-organ stimulation), in the case of breast and 
reproductive cancers (15, 108). 


Summary of Clinical Exercise Studies 


Clinical studies confirm that exercise influences many bodily systems and 
functions. Several possibly healthful effects of exercise have been identified. 
Some of these effects are acute responses to a single bout of exercise; others 
result from chronic training adaptations. 


EPIDEMIOLOGICAL STUDIES OF ACTIVITY OR 
FITNESS AND HEALTH 


Cardiovascular Diseases 


Increased risk of cardiovascular diseases caused by sedentary lifestyle has 
been evaluated in more epidemiological studies than for all other disease 
endpoints combined, and coronary heart disease (CHD) is by far the most 
frequently studied of the cardiovascular diseases. Numerous review papers 
are available on the risk of CHD associated with sedentary habits; in 1987, 
Powell et al (97) published one of the most comprehensive of these papers. As 
it has been established that sedentary habits are causally related to increased 
risk of CHD, we will not review this topic in detail. 


HYPERTENSION Cross-sectional studies show lower blood pressures in ac- 
tive and fit persons, compared with their unfit and sedentary peers (19, 40). 
The magnitude of differences in blood pressure across activity or fitness 
groups is modest, typically less than 10 mm Hg for systolic pressure and 5 
mm Hg for diastolic pressure. This association appears to be independent of 
potential confounding variables, such as body fat, alcohol intake, family 
history of hypertension, and age. However, activity does not seem to normal- 
ize the blood pressure in all hypertensive persons (43). 

One prospective epidemiological study evaluated change in physical fitness 
in relation to change in blood pressure (10). A total of 753 middle-aged men 
were followed for an average of 1.6 years, with physical fitness assessed at
baseline and follow-up examinations by maximal exercise treadmill testing. 
Increases in fitness and decreases in body weight were associated with 
decreases in systolic and diastolic blood pressures. The association between 
fitness change and blood pressure change disappeared in multiple regression 
models when change in body weight was added. Thus, the effect of fitness 
change on blood pressure was largely mediated by changes in weight. 

Two prospective studies have examined the effect of sedentary habits or low levels of
physical fitness on the risk of developing physician-diagnosed hypertension. Both
studies followed large groups [14,998 Harvard alumni (90) and 4820 men and 
1219 women from the Cooper Clinic (11)] for up to 12 years. All study 
participants were free of diagnosed hypertension at baseline. The risk of 
developing physician-diagnosed hypertension during follow-up was increased 
by 35% in sedentary as compared with active alumni, and by 52% in unfit as 
compared with fit Cooper Clinic patients. These results were not due to 
confounding by such factors as age, smoking habit, family history of 
hypertension, or body composition. 


STROKE There are only a few epidemiological reports on physical activity 
or fitness and incidence of stroke, and the findings are equivocal. Results of 
these studies are displayed in Table 2. A problem in interpreting these data is 
that most studies do not distinguish between hemorrhagic and nonhemorrhag- 
ic (thromboembolic) stroke. We may reasonably expect that physical activity 
or fitness could have an impact on nonhemorrhagic stroke, as this disease
seems to have a pathogenetic mechanism similar to that ascribed to CHD, and
activity and fitness are inversely related to CHD. Activity and fitness might
affect risk of hemorrhagic stroke indirectly via an association with blood 
pressure, but the association, if present, would likely be weak. Stroke in- 
cidence in the Harvard alumni study shows a strong inverse gradient across 
leisure time physical activity in kilocalories per week (86). Job-related activ- 
ity shows a U-shaped relationship with stroke among Italian railroad workers. 
Workers in both sedentary and heavy activity categories have an elevated 
relative risk of 2.2 compared with workers in the moderate activity group 
(73). We consider the possible relationship between activity or fitness and 
stroke to be likely, but not established. As evidenced by Table 2, problems in 
further interpretation stem from varying definitions of physical activity (occu- 
pational/leisure time, lifetime versus point estimate), outcome, and differ- 
ences in populations under study. 


PERIPHERAL VASCULAR DISEASE If an active and fit way of life reduces 
the risk of atherosclerotic coronary disease, it might also affect peripheral 
atherosclerotic disease. Investigators from the Framingham Heart Study ex- 
amined the 14-year incidence of peripheral artery disease by physical activity 
index at baseline in men aged 35-64 years (54). Bivariate and multivariate
analyses showed no relationship between activity and peripheral artery dis- 
ease. 


Cancer 


Nearly 70 years ago, investigators noted that death rates from cancer among 
men classified by occupational assignment were inversely related to energy 
expenditure from muscular activity (18, 109). More recently, evidence has 
accumulated that physical activity may protect against colon, but not rectal, 
cancer (1, 38, 39, 61, 95, 107, 110, 121, 125, 127). 

Physical activity assessment at a single time may not reflect activity over 
the long term, and long-term activity may be important for diseases such as 
cancer, which has a long development stage. Two points of activity assess- 
ment (1962 or 1966 and 1977) were obtained in 17,148 Harvard alumni who 
were followed prospectively for colon and rectum cancer occurrence by 1988 
(67). Higher levels of physical activity, which were evaluated by using either 
assessment taken alone, were not associated with colon cancer risk. However, 
alumni who were highly active (energy expenditure of 2500 or more kcal per 
week) at both assessments had half the risk of developing colon cancer as 
those who were inactive (less than 1000 kcal per week) at both assessments. 
Thus, either consistently higher levels of activity are necessary to protect 
against colon cancer, or combining two assessments increases the precision of 
the physical activity measurement. No evidence was found that higher levels 
of activity protected against rectum cancer. 

Clinical and laboratory studies have suggested a role of testosterone in the 
development of prostate cancer. Exercise may have physiologic effects on sex
hormone production and utilization. Accordingly, these same Harvard alumni 
were followed for the incidence of this cancer in the same 26-year period (68). 
Although men who were highly active (expending 4000 or more kcal per 
week at both assessments) were at reduced risk of prostate cancer, there was 
no gradient response of protection at lower levels of energy expenditure, and 
these findings need to be repeated. 

In like fashion, observations suggesting a lower risk of breast cancer among 
women athletes as compared with nonathletes (36) are based on small num- 
bers and must be interpreted cautiously. Further, this particular study is based 
on interviews with women who have survived breast cancer, and selection or 
survival biases cannot be ruled out in interpreting the findings. 

Physical fitness, as assessed by maximal exercise tolerance on a treadmill 
test, is inversely associated with cancer mortality in the Aerobics Center 
Longitudinal Study (12). There were 64 cancer deaths in 10,224 men and 18 
cancer deaths in 3120 women who were followed for an average of eight years 
(total of 110,482 person-years of observation). Age-adjusted cancer death 
rates per 10,000 person-years of observation across low, moderate, and high 
physical fitness categories were 20, 7, and 5 in men and 16, 10, and 1 in
women, and these trends are statistically significant. The number of deaths in 
this study is relatively small at this time and precludes an evaluation of the 
association of fitness with site-specific cancer deaths. All patients in the 
analysis were apparently healthy at baseline; persons with a history or evi- 
dence of several chronic diseases were excluded. However, some individuals 
probably had subclinical cancer already present at baseline. Undetected dis- 
ease could cause lassitude and inactive habits and result in lower fitness 
levels. Thus, some of the association between fitness and cancer mortality
may have been due to cancer causing low fitness. However, the inverse
gradient of cancer mortality across fitness groups is striking and indicates a 
need for additional research. 
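
As a rough check on the scale of these figures, a crude rate can be computed from the counts given above. The sketch below pools men and women and makes no age adjustment, so its result is not directly comparable to the age-adjusted, fitness-specific rates reported. The function name is our own.

```python
def rate_per_10000_person_years(events, person_years):
    """Crude event rate per 10,000 person-years of observation."""
    return 10000.0 * events / person_years


cancer_deaths = 64 + 18   # deaths in men plus deaths in women
person_years = 110482     # total person-years of observation

crude_rate = rate_per_10000_person_years(cancer_deaths, person_years)
print(f"{crude_rate:.1f} cancer deaths per 10,000 person-years (crude)")
```

The crude pooled rate falls within the range of the category-specific rates, as expected for an average over fitness groups.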


Diabetes (NIDDM) 


Noninsulin-dependent diabetes mellitus (NIDDM), which affects 10-12 million 
persons age 20 years or older, is a complex disorder characterized by increased 
insulin resistance and impaired insulin secretion. This disorder leads to increased 
risk of mortality from CHD and to other vascular complications, such as periph- 
eral vascular disease, kidney disease, and blindness (22, 28, 80). Along with 
proper control of body weight and a prudent diet, physical activity is commonly 
advocated in the management of NIDDM (50, 80, 106, 128), but it has been little
studied in the prevention or deferment of this disease. Certain indirect lines of
evidence support the contention that physical activity lowers risk of NIDDM. For 
example, physically active societies have less NIDDM than more sedentary 
societies (7, 26, 124); as populations have become less active, the incidence of 
this disease has increased steadily. Physical activity increases insulin sensitivity 
(103, 112), and regular endurance exercise induces weight loss and positive 
changes in glucose metabolism (59, 100). Physical activity has also been in- 
versely associated with the prevalence of diabetes in several cross-sectional 
studies (37, 58, 78, 116). 

Direct evidence of a protective role of physical activity against NIDDM has 
been demonstrated in a prospective study of University of Pennsylvania 
alumni (47, 89). By using mail questionnaires, contemporary physical activ- 
ity patterns and other life-style habits were examined in relation to the 
incidence of NIDDM in 5990 men; the disease developed in 202 of these men 
in 15 years of follow-up. 

Leisure-time physical activity, expressed as kilocalories (kcal) in walking, 
stair climbing, and recreational activities, was inversely related to the development
of NIDDM. Incidence rates declined as energy expenditure increased
from less than 500 to 3500 or more kcal per week. For each 500 kcal
increment in energy expenditure, the risk of diabetes was reduced by about 6%, and this
inverse relationship persisted when body composition, weight gain since 
college, history of hypertension, and parental history of diabetes were considered.
The protective effect of physical activity was strongest with moderate to
vigorous sports play. The effect was also strong in individuals considered at 
higher risk of NIDDM because they were overweight-for-height or hyperten- 
sive, or had parental history of diabetes. 
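
The reported reduction of about 6% per 500 kcal can be turned into an approximate dose-response curve. The sketch below assumes that the reduction compounds multiplicatively with each 500 kcal step and that the reference point is zero leisure-time expenditure; both are simplifying assumptions of ours, not statements from the study, and the function name is hypothetical.

```python
def relative_incidence(kcal_per_week, reduction_per_step=0.06, step_kcal=500.0):
    """Incidence relative to 0 kcal/week, compounding the reduction per step.

    Assumes a log-linear (multiplicative) dose-response, which is an
    illustrative modeling choice rather than a reported result.
    """
    return (1.0 - reduction_per_step) ** (kcal_per_week / step_kcal)


for kcal in (500, 1000, 2000, 3500):
    print(kcal, round(relative_incidence(kcal), 2))
```

Under these assumptions, an expenditure of 2000 kcal per week corresponds to roughly a fifth lower incidence than none, which gives a sense of the size of the gradient described above.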

This study among college alumni supports the concept that prevention or 
delay of NIDDM may be achieved by increasing overall activity and that 
vigorous activities (swimming, brisk cycling, running, etc.) may induce a 
stronger effect than more moderate activities. 


Osteoarthritis 


Osteoarthritis is a major public health problem in the United States (79), and 
some investigators are concerned that vigorous exercise may increase risk of 
the disease developing. The title of a recent editorial in the Journal of Internal 
Medicine, “Jogging—for a healthy heart and worn-out hips?,” expresses a 
common concern that exercise may increase the risk of osteoarthritis (32). 
Cross-sectional studies show no differences in the prevalence of osteoarthritis 
between runners and control subjects (64, 91). A two-year follow-up study by 
Lane et al (63) also shows similar progression rates for osteoarthritis in 
runners and controls. 

A preliminary analysis of data from the Aerobics Center Longitudinal 
Study shows no increase in osteoarthritis of the hip or knee across levels of 
exposure to running (13). The six-year incidence of osteoarthritis in a group 
of 1039 women and 4429 men was higher in older and more obese subjects. 
However, the incidence was not higher in subjects who had run more miles in their
lifetimes, had been running for more years, or had run more miles in the
year before the beginning of the study. Although selection/protection competition
cannot be unraveled in these early data, available indications are that
running and jogging are not associated with an increased risk of osteoarthritis 
of the hip or knee. 


Osteoporosis 


Osteoporosis, and the associated fracture risk, are major public health prob- 
lems, especially for older individuals. Peak bone mass is attained early in life, 
probably by the second or third decade (111). A gradual decline in bone 
mineral density occurs throughout middle-age and is markedly accelerated in 
women after menopause, especially during the first five postmenopausal years 
(111). Numerous studies on the relation of physical activity to bone mineral 
density have been conducted over the past several years. Two reviews (111, 
119) provide an excellent summary of these reports. 

The current research supports a few general conclusions. Clearly, bone 
responds to the physical stress of exercise. Regular physical activity is likely 
to boost peak bone mass in young women, probably slows the decline in bone 
mineral density in middle-aged and older women, and may increase bone 
mineral density in patients with established osteoporosis (111). Much additional
research is needed to clarify the specific type and amount of exercise
that most efficaciously promotes bone health at various stages of life. There is 
a paucity of studies of men, and this void also needs to be addressed. It is not 
clear how physical activity and other proven or suspected effective in- 
terventions, such as calcium supplementation and estrogen replacement ther- 
apy, might interact to promote or maintain bone health. 

Regular physical activity may provide benefits beyond a direct impact on 
bone mineral density. Active individuals have greater muscle mass and are 
stronger, which might reduce the risk of falling and protect against fractures 
when falls occur. Sorock et al (113) report a reduced risk of fracture (relative 
risk = 0.41 in men and 0.76 in women) in active individuals when compared 
with sedentary ones. 
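
A relative risk can be restated as a percentage change in risk, which is often easier to interpret. The short sketch below applies this conversion to the fracture estimates just cited; the function name is our own.

```python
def percent_change_in_risk(relative_risk):
    """Percentage change in risk implied by a relative risk (negative = reduction)."""
    return (relative_risk - 1.0) * 100.0


fracture_men = percent_change_in_risk(0.41)
fracture_women = percent_change_in_risk(0.76)

print(f"Men:   {fracture_men:.0f}%")
print(f"Women: {fracture_women:.0f}%")
```

The same conversion works in the other direction: a relative risk above 1.0 gives a positive value, corresponding to an increase in risk.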


Musculoskeletal Disability 

Musculoskeletal disorders are common, especially in older individuals. These 
disorders may contribute to inability to perform routine activities or to risk of 
falling. The high prevalence of relative disability in older persons is man- 
ifested by problems with walking, doing household chores, and accomplish- 
ing personal activities (20). Falls are a major health problem for the elderly. 
The etiology of falling is complex, and multiple factors are identified as
possible causes, but limitations in musculoskeletal function, such as low
levels of muscle strength, balance, and flexibility, may be contributors (118).

Runners report fewer limitations in routine activities and lower levels of 
disability than control subjects (65). Muscle dysfunction and problems with 
mobility are strongly associated with low levels of muscular strength (33). 
Furthermore, even elderly individuals (86-96 years) improve muscle strength 
with an eight-week, weight training program (33); in fact, average gains in 
strength of 175% were noted. Increases in strength were also associated with 
objective improvements in mobility tests. 

At present, data are limited, and more studies, including intervention trials, 
are needed to evaluate the possible impact of increased physical activity on 
the incidence of musculoskeletal disorders. However, older persons in par- 
ticular are clearly likely to suffer relative disability, decreased function, falls, 
and specific musculoskeletal disorders; some of these problems may be due to 
a progressive loss of musculoskeletal function caused by decades of sedentary 
living habits. Future work should focus on quantifying levels of activity and 
fitness required to prevent dysfunction and on appropriate and acceptable 
intervention programs to restore function. 


Summary of Epidemiological Studies


DOSE-RESPONSE RELATIONSHIP Most of the general public and many
health professionals believe that regular exercise is an important health habit.
For the past two decades, exercise scientists have promoted a scientific 
approach to exercise prescription that specifies exercise intensity, duration,
and frequency (2-5). These recommendations are based on numerous con- 
trolled trials of exercise training that have characterized the shape of the 
dose-response relationship of exercise to short-term improvements in physical 
fitness. The exercise prescription emphasizes relatively vigorous, large mus- 
cle activity for at least 20 minutes at a minimum of three times per week. This 
dose of exercise was adopted by the Surgeon General of the United States for 
the 1990 health objectives (98). Many public education campaigns, books, 
and articles have presented the exercise prescription approach as advice to the 
public. We believe that these activities have led both the public and health 
professionals to adopt a dichotomous view of exercise. That is, unless a 
person achieves the specified exercise prescription, there are no benefits or 
responses to the training program. In our opinion, this is an incorrect view, 
especially in terms of the health effects of physical activity. 

The relation between various levels of physical activity or physical fitness 
to mortality from five recent prospective studies is presented in Figure 1. 
These studies indicate that there is a gradient of risk across activity or fitness 
levels and that moderate levels of activity or fitness are associated with 
important and clinically significant reductions in risk. This observation op- 
poses the widely believed threshold concept, which asserts that there is no 
benefit from physical activity until the exercise prescription level is reached 
and that there are further improvements across higher levels of exercise. 
Figure 2 illustrates an idealized benefit curve (solid line) across activity or 
fitness levels based on current studies, and a second hypothetical curve 
(dashed line) that probably represents the prevailing opinion of the public and 
health professionals. 

The dose-response relationship indicated by the five studies is good news 
for sedentary individuals. They can have hope that a moderate physical 
activity program is likely to yield some important health benefits. The public 
health message should be “Doing some physical activity is better than doing 
none at all.” That is, a little is better than none, and, to a degree, more is 
better than less. The moderate level of physical fitness that is associated with 
much lower death rates than the low fitness level in the Aerobics Center 
Longitudinal Study (12) can be achieved with relatively little activity. A 
brisk, two-mile walk in 30-40 minutes (3-4 mph) taken on most days would 
be sufficient to produce the moderate fitness level defined in the study. A 
recent randomized clinical trial suggests that three ten-minute walks over the 
course of the day have about the same impact on physical fitness as one 
30-minute walk (21). Thus, exercise recommendations can emphasize the 
accumulation of 30 minutes of walking (or the energy expenditure equivalent 
in some other activity) over the day as sufficient to have important health and 
functional benefits. This approach may be less intimidating and easier to 
follow than the prescription of a continuous exercise session and should be 
considered for intervention programs (9, 46). A five-minute walk after break- 
fast and before dinner, a ten-minute walk at lunchtime, and a few minutes of 
stair climbing spread across the day would result in the accumulation of a dose 
of activity that should improve health and function in previously sedentary 
and unfit individuals. 

Figure 1 Rates for coronary heart disease, cardiovascular disease, or all-cause mortality are 
plotted on the vertical axis. The horizontal axis indicates exposure to various levels of physical 
activity or physical fitness. The figure is constructed from data taken from five prospective 
epidemiological studies: A (69); B (75); C (87); D (31); E and F (12). The rates in the different 
panels cannot be compared directly because of different methodology, endpoints, and study 
populations. 

Figure 2 The solid line indicates change in risk across levels of activity or fitness; this line is 
idealized from published prospective studies. The dashed (upper) line indicates the relation of 
disease endpoints to level of activity or fitness on the assumption that the traditional exercise 
prescription is required to obtain health benefits and that higher levels of activity or fitness 
produce additional benefits, as indicated by the decline in risk beyond the threshold point. 
Vertical axis: theoretical risk of morbidity or mortality. Horizontal axis: level of physical 
activity or physical fitness. 


METHODOLOGIC ISSUES IN POPULATION STUDIES OF PHYSICAL ACTIVITY 
Design and methodological concerns are sometimes raised regarding the 
interpretation of data from epidemiological studies. In this section, we discuss 
the issues of bias and physical activity assessment. 


Bias Much has been written about bias in population studies, and most 
standard texts treat the topic thoroughly (48, 102). Epidemiological studies of 
physical activity, physical fitness, and health have been typically conducted 
in opportunistic cohorts, such as college alumni (86, 87, 90), preventive 
medicine clinic patients (11, 12), or high risk men (69). Frequently, results 
from such studies are questioned because of possible selection bias. Selection 
bias is not a major problem in these studies, however, because persons 
enrolled in such studies come under observation before knowledge of any 
outcome. As in most epidemiological investigations, care must be taken when 
generalizing the results, and replications in other groups are needed. One 
possible bias in existing studies is that the sedentary or unfit subjects may be 
in those categories because they may already have some disease, which in 
turn causes inactivity and concomitantly increases risk of death. Investigators 
have dealt with these problems by evaluating the relationship of activity 
or fitness to mortality in early and later follow-up intervals (12, 75, 87), or by 
considering changes in classifications of work activity (84). 


Assessment Issues Efforts have been made to validate physical activity 
assessment instruments used in population studies (60, 104, 115), but there 
are several important issues that need further attention. First is the temporality 
of the physical activity exposure, as it may be positioned in the etiologic 
pathway or constellation of diseases and disorders. All studies to date have 
typically relied on a single, point estimate of physical activity (or inactivity) 
as a measure of exposure. Earlier studies (77, 84, 85) assessed relative and 
absolute energy expenditure needs on the job, whereas the more recent studies 
focused on leisure-time physical activity (69, 86, 87, 90). As in the study of 
dietary intake and disease, investigators have assumed that these point es- 
timates of activity are correlated with the habitual, or lifetime, exposure to 
physical activity that is more plausibly in an etiologic pathway; this assump- 
tion has not yet been confirmed. The problem of misclassification of exposure 
to physical activity (either by a change in behavior during a follow-up period 
or by real assessment error) based on a single baseline measure is one that 
should serve to underestimate the true point estimate of risk. Thus, we may 
argue that any increased risk demonstrated with a single, point estimate of 
physical activity should only be strengthened with a more complete and 
accurate, and less variable, measure of physical activity exposure. This has 
not often been demonstrated; a notable exception is the above-mentioned 
study of physical activity and colon cancer incidence (67). 
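
The attenuation argument above can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not a method taken from the studies cited: it supposes a binary exposure (sedentary versus active), nondifferential misclassification described by a hypothetical sensitivity and specificity, and a hypothetical true relative risk. The function name and every numeric input are invented for the example.

```python
def observed_relative_risk(true_rr, prevalence, sensitivity, specificity,
                           baseline_risk=0.01):
    """Relative risk seen after nondifferential misclassification of a
    binary exposure. All inputs are hypothetical illustration values."""
    risk_exposed = baseline_risk * true_rr
    risk_unexposed = baseline_risk

    # Make-up of the group *classified* as exposed (e.g. sedentary).
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    # Make-up of the group *classified* as unexposed (e.g. active).
    false_neg = prevalence * (1 - sensitivity)
    true_neg = (1 - prevalence) * specificity

    risk_classified_exposed = (
        (true_pos * risk_exposed + false_pos * risk_unexposed)
        / (true_pos + false_pos)
    )
    risk_classified_unexposed = (
        (false_neg * risk_exposed + true_neg * risk_unexposed)
        / (false_neg + true_neg)
    )
    return risk_classified_exposed / risk_classified_unexposed


# Assumed true relative risk of 2.0 with half the population exposed.
perfect = observed_relative_risk(2.0, 0.5, 1.0, 1.0)
noisy = observed_relative_risk(2.0, 0.5, 0.8, 0.8)
print(round(perfect, 2), round(noisy, 2))
```

Under these assumed inputs the error-free instrument recovers the true relative risk, whereas the imperfect one yields a value closer to 1.0, which is the direction of bias described in the text.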

The second issue is that even if the assumption of a single, point-estimate 
of physical activity is etiologically valid, it is unknown how many days (or 
views) of assessment are necessary to build a picture of true habitual energy 
expenditure. As with dietary intake (70), we can reasonably assume a certain 
degree of intraindividual variation in energy expenditure. Thus, how many 
assessment days are needed to minimize this intraindividual variation and 
provide unbiased estimates of physical activity habits? Such work has been 
done in the area of dietary intake (6, 70), but nothing is yet available for 
energy expenditure. This problem relates to measurement error and subse- 
quent misclassification of exposure in much the same way as was discussed 
above, and must be solved to provide more precise estimates of physical 
activity exposure. New approaches to physical activity assessment need to be 
developed to address these problems and to approximate better the physiologic 
parameters of interest in different populations. 
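
The question of how many assessment days are needed can be framed with a standard result from measurement theory, offered here as a sketch rather than as an established method for energy expenditure. It assumes that daily values vary randomly around each person's habitual level, so that the reliability of an n-day mean is the between-person variance divided by the sum of the between-person variance and the within-person variance over n. The variance ratio used below is hypothetical; as the text notes, no such estimates are yet available for energy expenditure.

```python
import math


def reliability_of_mean(n_days, within_var, between_var):
    """Reliability of the mean of n_days of assessment, assuming
    independent day-to-day variation around a stable habitual level."""
    return between_var / (between_var + within_var / n_days)


def days_needed(target_reliability, within_var, between_var):
    """Smallest whole number of days whose mean reaches the target."""
    ratio = within_var / between_var
    n = target_reliability / (1 - target_reliability) * ratio
    # Small tolerance guards against floating-point overshoot before rounding up.
    return math.ceil(n - 1e-9)


# Hypothetical example: within-person variance twice the between-person variance.
n = days_needed(0.8, within_var=2.0, between_var=1.0)
print(n, round(reliability_of_mean(n, 2.0, 1.0), 2))
```

The required number of days rises in direct proportion to the ratio of within-person to between-person variance, which is why that ratio must be measured before assessment protocols can be designed.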


DESCRIPTIVE EPIDEMIOLOGY OF PHYSICAL 
ACTIVITY IN THE UNITED STATES 


Physical Activity Within Demographic Groups 


The contributions of physical activity to a healthful lifestyle have received an 
increasing amount of emphasis over the past two or three decades. Casual 
observation that adults are becoming more physically active can be supported 
by data from national surveys that show small increases in the percentage of 
individuals who are active and decreases in the percentage who are sedentary 
(114). We are, however, not an active society; seven out of eleven of the 
Surgeon General’s objectives for activity and fitness for 1990 were probably 
not achieved (98). Data from the 1985 National Health Interview Survey 
show that 25% of adult men and 30% of adult women were sedentary (no 
reported physical activity in the past month) (16). Another 30% of men and 
women were classified as irregularly active, and only 8% of the men and 7% 
of the women were exercising at the level recommended in the 1990 objec- 
tives. Physical activity levels generally were inversely related to age and 
directly related to educational level and income. Whites appeared to be 
somewhat more active than blacks and persons with race not specified. 


Population Attributable Risks of Low Activity and Fitness 


The epidemiological studies reviewed above support the inference that low 
levels of physical activity and physical fitness are strong and independent risk 
factors for cardiovascular, cancer, and all-cause mortality. The high preva- 
lence of sedentary habits in the US thus leads to a high population attributable 
risk for sedentary lifestyle. Paffenbarger et al (87) calculate the population 
attributable risk for all-cause mortality for sedentary habits (< 2000 kilocalor- 
ies per week in physical activity; approximately 60% of the Harvard alumni 
were at risk by this definition) to be 16%, compared with 6% for hyperten- 
sion, 22% for cigarette smoking, and 5% for a positive family history of early 
parental death. Low physical fitness (least fit quintile) in the Aerobics Center 
Longitudinal Study was associated with population attributable risks of 9% in 
men and 15% in women (12). These risk estimates were comparable to, or 
higher than, the estimates for other well established risk factors, such as 
cigarette smoking, elevated blood cholesterol or blood pressure, high fasting 
blood glucose, high body mass index, and a history of premature coronary 
heart disease death in a parent. 
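
For readers unfamiliar with the measure, the sketch below shows how a population attributable risk depends on both the prevalence of a risk factor and its relative risk. It uses the simple unadjusted (Levin) formula, which is an assumption made for this illustration; the published estimates cited above were derived from multivariable analyses and will not be reproduced exactly. The function names are hypothetical.

```python
def population_attributable_risk(prevalence, relative_risk):
    """Unadjusted (Levin) population attributable risk as a fraction."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def implied_relative_risk(prevalence, par):
    """Relative risk that would produce a given attributable risk
    under the same unadjusted formula."""
    return 1.0 + par / (prevalence * (1.0 - par))


# Prevalence (60%) and attributable risk (16%) as quoted in the text for
# sedentary habits; the implied relative risk is a back-calculation under
# the simplifying assumption above, not a published figure.
rr = implied_relative_risk(0.60, 0.16)
print(round(rr, 2), round(population_attributable_risk(0.60, rr), 2))
```

The calculation shows why a common exposure with a modest relative risk can account for as many deaths as a rarer exposure with a larger one.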

Hahn et al (44) recently estimated the number of deaths attributed to several 
risk factors for nine chronic diseases. The estimates were based on published 
studies and death rates in the US in 1986. The number of deaths attributed to 
sedentary habits [sedentary or irregularly active as described by Caspersen et 
al (16)] was 256,686. This number was exceeded by the estimates for 
smoking (361,911) and obesity (261,988), but was greater than the numbers 
estimated for elevated cholesterol (253,194) or hypertension (225,962). 

Population-attributable risk estimates for sedentary habits and low physical 
fitness are high. Inactivity in the US appears to be a public health problem that 
is of comparable magnitude to cigarette smoking, obesity, high blood pres- 
sure, and high blood cholesterol levels. 


SUMMARY 


Research studies over the past several decades confirm the health benefits of 
regular physical activity, a concept with foundations in antiquity. The effects 
of activity on certain individual health conditions, the precise dose of activity 
that is required for specific benefits, the role (if any) of intensity of effort, and 
the elucidation of biological pathways whereby activity contributes to health 
are topics for further research. Although details remain to be clarified, it is 
now clear that regular physical activity reduces the risk of morbidity and 
mortality from several chronic diseases and increases physical fitness, which 
leads to improved function. Table 3 outlines the relationship of activity to 
several diseases, a judgment on the strength of the evidence, and a rough 
determination of the amount of research extant. Results from clinical exercise 
studies and epidemiological investigations can be integrated into a consistent 
and coherent theory of healthful physical activity. However, some differences 
between these two research streams need to be reconciled. Exercise physiologists 
have generally recommended relatively intensive activity and a formal 
approach to exercise prescription. The epidemiological studies suggest a 
linear dose-response relationship, at least up to a point, between physical 
activity and health and functional effects. These data support public health 
recommendations directed toward the most sedentary and unfit stratum of the 
population and emphasize doing at least moderate physical activity. If this 
group of adults would accumulate 30 minutes of walking per day (or the 
equivalent energy expenditure in other activities), they would receive clinical- 
ly significant health benefits. An important point is that it does not matter 
what type of physical activity is performed: Sports, planned exercise, house- 
hold or yard work, or occupational tasks are all beneficial. The key factor is 
total energy expenditure; if that is constant, improvements in fitness and 
health will be comparable. There are probably 40 million adults in the US 
whose sedentary habits place them at considerably increased risk of morbidity 
and mortality from several diseases. These same individuals also are more 
likely to have functional limitations, especially as they move into the later 
years of life. 

The sizable independent relative risk for impaired health in sedentary 
persons, and the large number at risk, lead to a substantial public health 
burden. This problem deserves continued and increased attention by physi- 
cians and other health professionals, scientists, and the public health es- 
tablishment. 


Table 3 Summary results of studies investigating the relationship of physical 
activity or physical fitness to selected incidences of chronic diseases^a,b 

                                                   Trends across activity or fit- 
                                    Number of      ness categories and strength 
Disease                             studies        of evidence 

Obesity 
Coronary artery disease 
Hypertension 
Stroke 
Peripheral vascular disease 
Cancer (all sites) 
  colon 
  rectum 
  breast 
  prostate 
  lung 
Non-insulin dependent diabetes 
Osteoarthritis 
Osteoporosis 
Musculoskeletal disability 

^a *Few studies, probably less than 5; **several studies, approximately 5-10; ***many 
studies, more than 10. 

^b → No apparent difference in disease rates across activity or fitness categories; ↓ some 
evidence of reduced disease rates across activity or fitness categories; ↓↓ good evidence of 
reduced disease rates across activity or fitness categories, control of potential confounders, 
good methods, some evidence of biological mechanisms; ↓↓↓ excellent evidence of 
reduced disease rates across activity or fitness categories, good control of potential confound- 
ers, excellent methods, extensive evidence of biological mechanisms, relationship is consid- 
ered causal. 


INTRODUCTION 


Within months after Roentgen’s discovery of the x-ray in 1895, burns of the 
skin and other acute injuries were encountered in pioneer radiation workers 
(12, 27), thus prompting efforts to reduce the levels of occupational exposure. 
In subsequent years, as the thresholds for different types of reactions gradual- 
ly became better known, the exposure limits for radiation workers underwent 
a series of further reductions (82, 83). As a result, protection standards 
ultimately evolved that now suffice to prevent gross damage of tissue, barring 
radiation accidents (37). Although today’s protection standards are adequate 
to prevent gross injury, they are not presumed to protect completely against 
the mutagenic and carcinogenic effects of radiation, which may have no 
thresholds (57, 87). 

The concept that the mutagenic effects of radiation might have no threshold 
dates from the 1940s, when classical experiments on the fruitfly suggested 
that the frequency of mutations increases in proportion to the dose of x-rays, 
without a threshold (54). In the 1950s, apprehension that future generations 
might suffer genetic harm from the global increases in environmental radia- 
tion caused by atmospheric testing of nuclear weapons prompted national (55) 
and international (94) assessments of the risks, which have been periodically 
updated (87, 89-93). 

Concern about the hazards associated with low-level irradiation was further 
heightened in the 1950s by the suggestion that the risk of leukemia may also 
increase in proportion to the radiation dose, and that a significant percentage 
of such cancers in the general population may, therefore, result from 
natural background irradiation (44). This suggestion was reinforced at about 
the same time by the observation of an association between childhood leuke- 
mia and prenatal diagnostic x-irradiation (47, 79, 80). Since then, the non- 
threshold dose-incidence hypothesis has also been extended to other malig- 
nancies; the risks of radiation-induced cancer are now thought to exceed the 
risks of heritable effects in the low dose domain (57, 87). 

Because assessments of the risks of low-level irradiation are highly un- 
certain, they have been a subject of ongoing scientific controversy. As a result 
of such controversy, the fact that the risk estimates have continued to change 
with the evolution of new information, and confusion over terminology (e.g. 
the introduction of new units of measure), the assessments have failed to gain 
public understanding and credibility. To put the pertinent public health issues 
into perspective, we review the status of our knowledge of the effects of 
low-level ionizing radiation, with particular reference to the implications of 
recent revisions in the relevant risk assessments by national and international 
committees. 


SOURCES AND LEVELS OF RADIATION IN THE 
ENVIRONMENT 


Radiation Quantities and Units 


Radiation quantities and doses are expressed in various units (35). The unit 
now used internationally for expressing the dose of radiation that is absorbed 
in tissue is the gray (Gy). The unit formerly used for the same purpose is the 
rad (1 Gy = 1 joule per kg of tissue = 100 rad). 

Because particulate radiations generally cause greater injury than x-rays or 
gamma rays for a given dose in Gy, another unit, the sievert (Sv), is used in 
radiological protection to enable doses of different types of radiation to be 
normalized in terms of biological effectiveness. Thus, the dose equivalent of 
any radiation in sieverts is the dose in Gy multiplied by an appropriate 
weighting factor, so that one sievert of any radiation represents, in principle, 
the dose that is equivalent in biological effectiveness to one gray of gamma 
rays. The unit formerly used for the same purpose is the rem (1 Sv = 100 
rem). 

Because the probability of injury from a given dose equivalent varies with 
the organ or tissue irradiated, a further unit—the effective dose equivalent—is 
also used. This unit, expressed in Sv, denotes the dose equivalent to the tissue 
of interest, weighted by the ratio between the resulting risk of injury and the 
risk of injury that would be attributable to the same dose equivalent if it were 
delivered to the body as a whole. 

For expressing the collective dose equivalent to a population, the person-Sv 
(or person-rem) is used; this unit represents the product of the average dose 
equivalent per person times the number of persons exposed, e.g. 1 sievert to 
each of 100 persons = 100 person-sievert (= 10,000 person-rem). A col- 
lective dose equivalent that is expected to be received over a period of time 
extending into the future, as from an internally deposited radionuclide, is 
called the committed collective dose equivalent, or collective dose equivalent 
commitment. 

The amount of radioactivity that is present at any one time in a given 
sample of matter is expressed in becquerels; one becquerel (Bq) corresponds to 
that quantity of radioactivity in which there is one atomic disintegration per 
second. Another unit that has been used for the same purpose is the curie (Ci); 
one Ci represents that quantity of radioactivity in which there are $3.7 \times 10^{10}$ 
atomic disintegrations per second (1 Bq = $2.7 \times 10^{-11}$ Ci). 
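
The unit relationships described in this section can be collected in a few lines of code. The sketch below is a convenience illustration only; the function and constant names are hypothetical, and the weighting factor of 20 used in the example is an assumed value for a densely ionizing radiation rather than a figure given in this review.

```python
RAD_PER_GY = 100.0   # 1 Gy = 100 rad
REM_PER_SV = 100.0   # 1 Sv = 100 rem
BQ_PER_CI = 3.7e10   # disintegrations per second in one curie


def dose_equivalent_sv(absorbed_dose_gy, weighting_factor):
    """Dose equivalent (Sv) = absorbed dose (Gy) x weighting factor."""
    return absorbed_dose_gy * weighting_factor


def collective_dose_person_sv(mean_dose_sv, persons):
    """Collective dose equivalent = mean dose per person x persons exposed."""
    return mean_dose_sv * persons


def bq_to_ci(activity_bq):
    """Convert an activity in becquerels to curies."""
    return activity_bq / BQ_PER_CI


# Example from the text: 1 Sv to each of 100 persons.
collective_sv = collective_dose_person_sv(1.0, 100)
collective_rem = collective_sv * REM_PER_SV

# Assumed example: 0.01 Gy of a radiation with a weighting factor of 20.
alpha_sv = dose_equivalent_sv(0.01, 20)

print(collective_sv, collective_rem, alpha_sv, bq_to_ci(1.0))
```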


Sources and Levels of Exposure 


Natural background radiation, which exists at varying intensities throughout 
the environment, comes from three main sources: cosmic rays; radium, 
thorium, uranium, and other radioactive elements in the earth’s crust; and 
potassium-40, carbon-14, and other radionuclides contained in living cells 
themselves. The three components of natural background radiation—cosmic 
rays, external gamma radiation, and internal radiation—each account for 
about one third of the total annual dose of slightly less than 1.0 mSv (100 
mrem) received on average by a person who resides at sea level in the US 
(Table 1). In certain locations, one of the components may be increased by a 
factor of two. For example, living at a mile-high elevation may increase the 
cosmic ray dose to 0.50 mSv (50 mrem) per year; living in an area where the 
earth is rich in radium may increase the external gamma-ray dose to 0.60 mSv 
(60 mrem) per year. However, both components are rarely doubled at the 
same time, and the dose from internal emitters (other than radon) is quite 
constant, so that the total annual dose is unlikely to increase by more than 
30%, depending on location (62). 

Moreover, the average dose to the bronchial epithelium from inhaled radon 
and its daughters greatly exceeds the dose to any other soft tissue of the body 
(Table 1), and regions of the bronchial epithelium in smokers receive addi- 
tionally even larger doses from polonium-210, which is present naturally in 
tobacco smoke (65). Also, at the high altitudes of modern jet aircraft travel, 
the hourly dose-equivalent may exceed 0.005 mSv (0.5 mrem) on northerly 
routes and may reach levels many times higher during solar flares (62). 
Table 1 Average amounts of ionizing radiation received annually from different 
sources by members of the US population^a 

                                 Dose Equivalent^b        Effective Dose 
Source                           (mSv)      (mrem)        (mSv)      (%) 

Natural 
  Radon^c                        24         2,400         2.0 
  Cosmic                                    27            0.27 
  Terrestrial                               28            0.28 
  Internal                                  39            0.39 
  Total natural                  -          -             3.0 
Artificial 
  Medical 
    x-ray diagnosis              0.39                     0.39 
    Nuclear medicine             0.14                     0.14 
  Consumer products              0.10                     0.10 
  Other 
    Occupational                 0.009                    <0.01 
    Nuclear fuel cycle           <0.01                    <0.01 
    Fallout                      <0.01                    <0.01 
    Miscellaneous^d              <0.01                    <0.01 
  Total artificial               0.63                     0.63 
Total natural and artificial     -                        3.6 

^a From 57, 63, 64. 

^b Average dose equivalent to soft tissues. 

^c Dose equivalent to bronchial epithelium from radon daughter products. The assumed weight- 
ing factor for the effective dose equivalent, relative to whole-body exposure, is 0.08. 

^d Department of Energy facilities, smelters, transportation, etc. 
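
The arithmetic behind Table 1 can be checked directly. The sketch below is offered only as a reader's cross-check, with hypothetical variable names: it applies the stated weighting factor of 0.08 to the bronchial dose equivalent from radon and sums the tabulated effective doses. The "Other" artificial sources, each listed as less than 0.01 mSv, are omitted from the sum.

```python
# Radon: bronchial dose equivalent weighted to a whole-body equivalent.
radon_bronchial_msv = 24.0
radon_weighting_factor = 0.08
radon_effective_msv = radon_bronchial_msv * radon_weighting_factor

# Effective doses (mSv per year) as tabulated.
natural = {"radon": 2.0, "cosmic": 0.27, "terrestrial": 0.28, "internal": 0.39}
artificial = {
    "x-ray diagnosis": 0.39,
    "nuclear medicine": 0.14,
    "consumer products": 0.10,
}

total_natural = sum(natural.values())
total_artificial = sum(artificial.values())
total = total_natural + total_artificial

print(round(radon_effective_msv, 2), round(total_natural, 2),
      round(total_artificial, 2), round(total, 1))
```

The weighted radon value of about 1.9 mSv corresponds to the tabulated 2.0 mSv, and the component sums agree with the tabulated totals after rounding.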


In addition to the dose received from natural background radiation, the 
population is exposed to radiation from various anthropogenic sources, the 
most important of which is the use of x-rays in medical diagnosis (Table 1). 
Lesser amounts of man-made radiation are received from various other 
sources, including radioactive minerals in building materials, phosphate 
fertilizers, and crushed rock; radiation-emitting components of TV sets, 
smoke detectors, and other consumer products; radioactive fallout from atom- 
ic weapons; and nuclear power (Table 1). 


NATURE AND MECHANISMS OF INJURY BY 
LOW-LEVEL IRRADIATION 


Through random collisions with atoms and molecules in cells, radiation gives 
rise to ions and reactive radicals, which, in turn, break chemical bonds and 
cause other molecular alterations that lead to injury. The distribution of the 
ionization events along the path of an impinging radiation depends on the 
energy, mass, and charge of the radiation; x-rays and gamma rays produce 
ions sparsely along their tracks—that is, they are characterized by a low rate 
of linear energy transfer (LET)—in contrast to charged particles, which are 
densely ionizing. 

Any molecular constituent of the cell can be altered by radiation, but DNA 
is the most critical biological target because of the limited redundancy of the 
genetic information it contains, i.e. damage to a single gene, if unrepaired, 
may kill or profoundly alter the affected cell. A dose of radiation large enough 
to kill the average dividing cell (~2 Sv, or 200 rem) causes hundreds of 
lesions in its DNA molecules; however, because most such lesions are 
reparable, the ultimate fate of a given lesion depends on the outcome of the 
cell’s DNA repair processes, as well as on the initial lesion itself (103). For a 
given dose, the probability of an irreparable lesion is far higher with a densely 
ionizing radiation (e.g. a proton or an alpha particle) than with a sparsely 
ionizing radiation (e.g. an x-ray or a gamma ray) (28). 

Unrepaired or misrepaired damage to DNA, in the form of mutations, has 
been well documented in many types of cells, including human lymphocytes 
(29) and erythrocyte precursors (42). At a given genetic locus, the frequency 
of mutations tends to increase linearly in the low-to-intermediate dose range, 
approximating $10^{-5}$ to $10^{-6}$ per Sv (57, 87). The linear nonthreshold nature 
of the dose-response relationship implies that a mutation can, in principle, 
occasionally result from the traversal of the genetic target by a single ionizing 
particle. 

Radiation-induced damage may also break chromosomes, thus leading to 
changes in chromosome structure and number. The combined frequency of 
translocations, dicentrics, rings, and other chromosome rearrangements in- 
creases as a linear, nonthreshold function of the radiation dose in the low-to- 
intermediate dose range, approximating 0.1 per cell per Sv (100 rem) in 
human lymphocytes scored soon after irradiation in vitro (45). It is not 
astonishing, therefore, that the aberration frequency increases in the lympho- 
cytes of atomic bomb survivors, radiation workers, and persons residing in 
areas of elevated natural background radiation. Because the dose-response 
relationship has been well characterized, the aberration frequency can serve as 
a crude biological dosimeter in radiation accident victims (45). 
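
The use of aberration frequency as a crude dosimeter amounts to inverting the linear dose-response relationship. The sketch below assumes the approximate yield quoted above (0.1 aberrations per cell per Sv) and a purely linear response; the function name, the background term, and the example counts are hypothetical, and a working laboratory would use its own calibration curve.

```python
ABERRATIONS_PER_CELL_PER_SV = 0.1  # approximate yield quoted in the text


def estimate_dose_sv(aberrations_scored, cells_scored, background_per_cell=0.0):
    """Crude dose estimate from the aberration frequency in lymphocytes,
    assuming a linear, nonthreshold dose-response relationship."""
    frequency = aberrations_scored / cells_scored
    excess = max(frequency - background_per_cell, 0.0)
    return excess / ABERRATIONS_PER_CELL_PER_SV


# Hypothetical example: 25 aberrations scored in 500 cells.
dose = estimate_dose_sv(25, 500)
print(round(dose, 2))
```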

Radiation-induced damage to genes, chromosomes, and other vital 
organelles may be lethal to the affected cells, especially dividing cells, which 
are highly radiosensitive as a class. Measured in terms of proliferative capac- 
ity, the survival of dividing cells tends to decrease exponentially with increas- 
ing dose; 1-2 Sv (100-200 rem) generally suffices to reduce the surviving 
fraction by about 50% (89). The killing of individual cells is a stochastic 
process, but because too few cells are killed by a dose below 0.5 Sv (50 rem) 
to cause detectable tissue damage in most human organs, except those of the 
embryo (37), such effects are not considered further in this report. 
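
The exponential decline in survival described above can be written as a simple function of dose. The sketch below assumes a pure single-exponential curve, a simplification adopted for illustration, and takes the dose that halves the surviving fraction as its only parameter. The function names are hypothetical.

```python
import math


def surviving_fraction(dose_sv, halving_dose_sv):
    """Fraction of dividing cells retaining proliferative capacity,
    assuming survival falls by half for each halving dose."""
    return 0.5 ** (dose_sv / halving_dose_sv)


def mean_lethal_dose(halving_dose_sv):
    """Dose that reduces survival to 1/e (about 37%) on the same curve."""
    return halving_dose_sv / math.log(2)


# Halving doses of 1 and 2 Sv bracket the range quoted in the text.
for halving_dose in (1.0, 2.0):
    print(halving_dose,
          round(surviving_fraction(0.5, halving_dose), 2),
          round(mean_lethal_dose(halving_dose), 2))
```

On this simplified curve, a dose of 0.5 Sv leaves most dividing cells viable.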


HERITABLE (GENETIC) EFFECTS 


Heritable mutations and chromosomal abnormalities increase in frequency 
with the dose of radiation to the germ cells in Drosophila, laboratory mice, 
and various other organisms (57, 87). However, such effects of radiation have 
not yet been detected in humans. Most strikingly, an intensive study of more 
than 76,000 children of atomic bomb survivors of Hiroshima and Nagasaki, 
carried out over four decades, has failed to detect heritable effects of radia- 
tion, as measured by untoward pregnancy outcomes, neonatal deaths, 
malignancies, balanced chromosomal rearrangements, sex-chromosome an- 
euploids, alterations of serum or erythrocyte protein phenotypes, changes in 
sex ratio, or disturbances of normal growth and development (70). Although 
negative, these findings are not incompatible with the data from the mouse, 
given the limited size of the study population and the comparatively small 
average gonadal radiation dose in question. Therefore, the findings are not 
interpreted to indicate that human germ cells are resistant to radiation 
mutagenesis, but rather that a dose of at least 1.0 Sv (100 rem) is required to 
double their mutation rate (57, 70). 
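
The doubling-dose concept lends itself to a short calculation. The sketch below assumes that the mutation rate rises linearly with dose, so that a dose equal to the doubling dose adds one spontaneous rate's worth of mutations. The function name is hypothetical, and the 30 mSv figure is the per-generation natural background dose used in Table 2. Because the text gives the doubling dose only as a lower bound, the result is an upper bound under this assumption.

```python
def relative_mutation_rate(dose_sv, doubling_dose_sv):
    """Mutation rate relative to the spontaneous rate, assuming a linear
    increase with dose."""
    return 1.0 + dose_sv / doubling_dose_sv


doubling_dose_sv = 1.0        # lower bound quoted in the text
background_per_generation_sv = 0.030   # 1 mSv per year over 30 years

rate = relative_mutation_rate(background_per_generation_sv, doubling_dose_sv)
print(round((rate - 1.0) * 100, 1))  # upper-bound percentage increase
```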

The limited data that are now available permit only tentative estimates of 
the risks of radiation-induced heritable abnormalities in humans. These data 
imply that the percentage of all genetic diseases attributable to natural background 
irradiation is small (Table 2); however, the estimates are fraught with 
uncertainty, owing to the lack of dose-response data for radiation-induced 
mutations in human germ cells and inadequate knowledge of the mutational 
component of many common disorders. No estimates are available for diseases 
of complex inheritance, which comprise the largest category of genetically 
related diseases (Table 2). 


Table 2 Estimates of the extent to which the frequencies of different heritable disorders 
are attributable to natural background irradiation^a 

                                                    Contribution From 
                                                    Natural Background Radiation^b 
Type of Disorder               Prevalence           First Generation    Equilibrium 
                                          (per million live births)^c 

Autosomal dominant             180,000              20-100              300 
X-linked                       400                  <1                  <15 
Recessive                      2,500                <1                  very slow increase 
Chromosomal                    4,400                <20                 very little increase 
Congenital abnormalities       20,000-30,000        30                  30-300 
Other disorders of complex 
etiology 
  Heart disease                600,000              not estimated       not estimated 
  Cancer                       300,000              not estimated       not estimated 
  Selected others              300,000              not estimated       not estimated 

^a From 57. 
^b Equivalent to 1 mSv per year, or 30 mSv per generation (30 yrs). 
^c Values rounded. 


CARCINOGENIC EFFECTS 


Ionizing irradiation causes neoplasms of many types in humans and laboratory 
animals (57, 87, 97). Such neoplasms characteristically take years or decades 
to develop and possess no features by which they can be distinguished 
individually from those induced by other causes. Moreover, with few ex- 
ceptions, their induction has mainly been detectable after relatively large 
doses (>0.5 Sv, or 50 rem) and has varied with the type of neoplasm, as well as 
with the age and sex of the exposed population (57, 87). 

The available data indicate that radiation carcinogenesis is a multistage 
process and that the dose-incidence curves for different neoplasms can differ 
in shape, as well as in slope. For certain human neoplasms, the dose- 
incidence data are compatible with linear, nonthreshold relationships, but 
other relationships cannot be excluded (57). Assessment of the extent to 
which the risk of cancer may be increased by low-level irradiation is therefore 
dependent on the use of extrapolation models, based on assumptions about the 
relevant dose-incidence relationships. 

Of various mathematical models that have been used to assess the risks of 
low-level radiation, the one judged to provide the best fit to the available data 
is of the form: 


R(d) = Ro [1 + f(@Mgtd)) l. 


where Ro denotes the age-specific background risk of death from a specific 
type of cancer; d denotes the radiation dose; f(d) denotes a function of dose 
that is linear for cancers other than leukemia and linear-quadratic for leukemia 
[i.e. fd) = ajd or fid) = ajd + ajd’); and g(b) denotes a risk function 
dependent on other parameters, such as sex, age at exposure, and time after 
exposure (57). Somewhat different risk models were used by the United 
Nations Scientific Committee on the Effects of Atomic Radiation (UN- 
SCEAR) in its latest assessment (87); namely, a simple additive risk model of 
the form: 


$$R(d) = R_0 + f(d)$$

and a simple multiplicative risk model of the form:

$$R(d) = R_0 [1 + f(d)]$$








both of which assumed the risk to vary as a linear nonthreshold function of the 
dose. 
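
As a rough illustration, the three model forms described above can be written as short functions. This is a minimal sketch and not the committees' fitting code: the function names are hypothetical, and every coefficient and baseline risk used here is a placeholder chosen only to show how the forms differ.

```python
from typing import Callable

DoseFunction = Callable[[float], float]


def linear(a1: float) -> DoseFunction:
    """f(d) = a1 * d, the linear form used for cancers other than leukemia."""
    return lambda d: a1 * d


def linear_quadratic(a2: float, a3: float) -> DoseFunction:
    """f(d) = a2 * d + a3 * d**2, the linear-quadratic form used for leukemia."""
    return lambda d: a2 * d + a3 * d ** 2


def modified_multiplicative(r0: float, d: float, f: DoseFunction, g: float) -> float:
    """R(d) = R0 * [1 + f(d) * g]; g is the modifier g(beta), already evaluated."""
    return r0 * (1.0 + f(d) * g)


def simple_additive(r0: float, d: float, f: DoseFunction) -> float:
    """R(d) = R0 + f(d); here f(d) is an absolute excess risk."""
    return r0 + f(d)


def simple_multiplicative(r0: float, d: float, f: DoseFunction) -> float:
    """R(d) = R0 * [1 + f(d)]; here f(d) is an excess relative risk."""
    return r0 * (1.0 + f(d))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; these are not fitted values.
    baseline = 0.002
    dose_sv = 0.1
    print(simple_additive(baseline, dose_sv, linear(a1=0.001)))
    print(simple_multiplicative(baseline, dose_sv, linear(a1=0.5)))
    print(modified_multiplicative(baseline, dose_sv, linear(a1=0.5), g=0.8))
```

Note that f(d) carries different units in the two families: an absolute risk increment in the additive form and a dimensionless relative excess in the multiplicative forms, so coefficients cannot be exchanged between them.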

The estimates of lifetime radiation-induced cancer risks derived with the 
above three models provide a range of projections (Table 3). As emphasized 
by the Biological Effects of Ionizing Radiation (BEIR) V Committee, the 
upper limit of the estimates is about twice the values shown, and the data do 
not “rigorously exclude the existence of a threshold,” or zero risk, in the low 
(mSv) dose domain (57). 

The tabulated risk estimates are appreciably higher than the “preferred” 
estimates that were published by the BEIR III Committee (60) in 1980, owing 
primarily to revised estimates of the doses of A-bomb radiation received by 
the survivors of Hiroshima and Nagasaki; the use of a multiplicative risk 
projection model in preference to an additive projection model; and the use of 
a linear dose-incidence model for all cancers other than leukemia, rather than 
a linear-quadratic model for all cancers (57). On the other hand, the tabulated 
estimates are not appreciably higher than those derived by the BEIR I 
Committee in 1972 (61), which were based on the use of analogous risk 
models. Thus, the tabulated values may provide a reasonably stable range of 
estimates, given the existing uncertainties in the data. The source of un- 
certainty that accounts for most of the difference between the upper and lower 
estimates concerns the extent to which the risks of radiation-induced cancers 
can be expected to remain elevated long after irradiation. Although the risks 
of leukemia reach a peak and then decrease within 25 years after irradiation, it
is not known how long the risks of other types of cancer may remain elevated.
The extent to which the risks of cancer in persons irradiated during childhood
may remain elevated after they have reached the age at which cancer becomes
prevalent in the general population can be determined only by further
long-continued follow-up of the atomic bomb survivors and other suitable
populations.


Table 3 Estimated lifetime risks of cancer attributable to
0.1 Sv rapid whole-body irradiation

Type or Site of Cancer     Cancer Deaths per 100,000 (a)

Lung                        60-170
Stomach                     90-130
Leukemia                    90-100
Colon                       30-80
Breast (female)             20-40
Urinary tract               20-40
Esophagus                   20-30
Ovary                       20-20
Multiple myeloma            10-20
Thyroid                     10-15
Remainder                   90-135

Total                      460-780

(a) Values (rounded) for a population of both sexes and all ages at time
of irradiation.

(b) Estimate based on simple additive risk model (87).

(c) Estimate based on modified multiplicative risk model (57).

(d) Estimate based on simple multiplicative risk model (88).

Apart from the uncertainty about the length of time that the risk of cancer 
may remain elevated after irradiation, other issues that complicate assessment 
of the risks of low-level irradiation include uncertainty about the shape of the 
relevant dose-incidence curve; uncertainty about the extent to which the 
carcinogenic effects of a given dose may be influenced by variations in its 
distribution within time and space; uncertainty about the degree to which the 
risks may vary with age at irradiation, sex, smoking habits, diet, and other 
factors that affect susceptibility to carcinogenesis; and uncertainty about the 
accuracy of the dose measurements, diagnoses, and other data on which the 
estimates are based. Because of these issues, the tabulated risk estimates must 
be interpreted with caution. 

Particular caution must be exercised in extrapolating from the estimates 
(Table 3) to predict the risk of cancer following the gradual accumulation of a 
given dose over a period of weeks, months, or years. Experiments on 
laboratory animals have demonstrated that the carcinogenic effects of low- 
LET radiation may be reduced by a factor of two-to-ten if the period of 
exposure is sufficiently prolonged (68, 88). The fact that no comparable 
reduction occurs with high-LET irradiation implies that the decrease does not 
result merely from age-dependent changes in susceptibility (88). In the ab- 
sence of adequate human data on the comparative carcinogenicity of pro- 
tracted low-LET irradiation, the UNSCEAR Committee (87) and the BEIR V 
Committee (57) were unable to specify the extent to which their projections 
may overestimate the risks of a dose of radiation that is accumulated over long 
periods of time. Insofar as the animal data are predictive for humans, the 
tabulated risk estimates (Table 3) may exaggerate by a factor of two or more 
the risks of cancer attributable to irradiation at low dose rates. 
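
To make the size of this adjustment concrete, the sketch below divides the Table 3 total by an assumed dose-rate reduction factor. The function name and the choice of factor are illustrative assumptions: the text gives only the range of two to ten seen in animal experiments, and neither committee specified a value for humans.

```python
def adjust_for_dose_rate(risk_range, reduction_factor):
    """Scale a (low, high) risk range down by an assumed reduction factor."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    low, high = risk_range
    return (low / reduction_factor, high / reduction_factor)


# Total cancer deaths per 100,000 after 0.1 Sv of rapid irradiation (Table 3).
TABLE_3_TOTAL = (460.0, 780.0)

if __name__ == "__main__":
    # 2 and 10 are the bounds of the reduction seen in animal experiments.
    for factor in (2, 10):
        print(factor, adjust_for_dose_rate(TABLE_3_TOTAL, factor))
```

With a factor of two, the tabulated total of 460-780 deaths per 100,000 would fall to 230-390; whether any such factor applies to humans remains an open question in the text.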

As noted above, radiation-induced cancers do not possess any characteris- 
tics by which they can be distinguished individually from those arising 
through other causes. However, the probability that a given cancer may have 
resulted from previous irradiation can, in principle, be estimated on the basis 
of the aforementioned risk models, if the radiation dose, age at exposure, and 
time since exposure are known and if the susceptibility of the affected 
individual is assumed to be no different from average. This approach, known 
as the probability of causation method, was the basis of a report mandated by 
the US Congress to provide a scientifically defensible method for evaluating 
compensation claims filed by citizens exposed to radioactive fallout down- 
wind from the Nevada test site (59, 69). 
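
The arithmetic behind this approach can be sketched as follows. Under a relative risk model, the share of the total risk contributed by the exposure is (R(d) - R0) / R(d), which equals ERR / (1 + ERR) when ERR is the excess relative risk. This is the generic attributable-fraction form, offered as an assumption about the calculation; the function names are hypothetical, and the mandated report (59, 69) may apply further adjustments not shown here.

```python
def probability_of_causation(baseline_risk: float, risk_after_exposure: float) -> float:
    """Share of the total risk attributable to exposure: (R(d) - R0) / R(d)."""
    if baseline_risk <= 0:
        raise ValueError("baseline_risk must be positive")
    if risk_after_exposure < baseline_risk:
        raise ValueError("risk_after_exposure must not be below baseline_risk")
    return (risk_after_exposure - baseline_risk) / risk_after_exposure


def probability_of_causation_from_err(excess_relative_risk: float) -> float:
    """Equivalent form for a multiplicative model: ERR / (1 + ERR)."""
    if excess_relative_risk < 0:
        raise ValueError("excess_relative_risk must be non-negative")
    return excess_relative_risk / (1.0 + excess_relative_risk)


if __name__ == "__main__":
    # Hypothetical case: exposure raises the risk by 25% over baseline.
    print(probability_of_causation(0.010, 0.0125))
    print(probability_of_causation_from_err(0.25))
```

Both calls describe the same hypothetical case and give the same answer, which shows that the result depends only on the relative excess and not on the absolute baseline.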


TERATOGENIC EFFECTS


Since the pioneer observations of Bergonie and Tribondeau early in this 
century, we have known that the embryo is highly radiosensitive. During 
critical stages in organogenesis, which characteristically occupy a few days or 
weeks for each organ, irradiation of the embryo causes various malformations 
and other developmental disturbances. Such teratogenic effects are generally 
thought to result from radiation-induced injury or death of substantial numb- 
ers of cells, and not to result unless an appreciable threshold dose is exceeded. 
Nevertheless, malformations of many types have been produced in laboratory 
animals by doses as low as 50 mGy (5 rem) delivered during critical stages of 
organogenesis (88). A limited number of malformations in human infants has 
also been clinically associated with prenatal irradiation in the past, following 
the use of older radiological techniques and equipment that delivered higher 
doses to the embryo than are received in current practice (10). 

Among the teratogenic effects observed in humans, especially noteworthy 
is a dose-dependent increase in the frequency and severity of impairment in 
brain development in A-bomb survivors who were irradiated between the 
eighth and the fifteenth weeks after conception. The pertinent data do not 
define the shape of the dose-effect curve, but are compatible with a nonthresh- 
old function for impairment of intelligence and for severe mental retardation 
(88). In this cohort (and, to a lesser extent, in the cohort irradiated between 
the sixteenth and the twenty-fifth week after conception) there is also a 
dose-dependent downward shift of IQ test scores, which amounts to as much 
as 25 points per Sv (100 rem) (57, 88). The data are not robust enough to 
define whether there is a threshold for this effect.


PUBLIC HEALTH IMPLICATIONS


Radon


Environmental radon (²²²Rn) exposure was first recognized as a significant
carcinogen in 1984. However, follow-up studies of underground miners had 
previously indicated that radon at the high concentrations encountered in 
mines of many types (i.e., iron, zinc, lead, fluorspar, and uranium) was the 
primary cause of the elevated incidence of lung cancer in such workers (66, 
67). The basic data for estimation of the lung cancer risks come from 
follow-up studies of four groups of underground miners: uranium miners in 
Colorado, Ontario, and Czechoslovakia, and iron miners in Malmberget, 
Sweden (58, 66). From the excess lung cancer risk (above that expected from 
smoking) in miners, we can estimate the resulting residential lung cancer risks 
by modeling lifetime radon exposures in the home, as the dose per unit radon 
exposure is essentially the same in homes as in mines (31, 56). In the three 
main published estimates of the risk of lung cancer arising from environmen-
tal radon exposure (36, 58, 66), the risk models have differed significantly in 
basic assumptions, but the quantitative estimates of lung cancer risk have 
agreed within a factor of three (Table 4) and are consistent with the pre- 
liminary results of case-control studies on the risks arising from environmen- 
tal exposure (5). 

In view of the risks associated with exposure to radon, and the fact that 
surveys in Sweden, the UK, the Federal Republic of Germany, Canada, and 
the US imply that radon contributes more than 50% of the annual effective 
dose equivalent to the population from natural sources (Table 1), many 
countries have issued guidelines for limiting exposure to radon. These coun- 
tries have also recommended methods for reducing long-term exposure to 
radon in future housing stock (22). 
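
Radon concentrations in this review are quoted in both Bq per m³ and pCi per liter (Table 4), and doses in both Sv and rem. The conversions can be sketched as below; the function names are hypothetical, while the factors follow from 1 Ci = 3.7 x 10^10 Bq, 1 m³ = 1000 liters, and 1 Sv = 100 rem.

```python
BQ_PER_PCI = 0.037       # 1 pCi = 1e-12 Ci = 0.037 Bq
LITERS_PER_M3 = 1000.0   # 1 cubic meter = 1000 liters
REM_PER_SV = 100.0       # 1 Sv = 100 rem


def pci_per_liter_to_bq_per_m3(pci_per_liter: float) -> float:
    """Convert an activity concentration from pCi/L to Bq/m^3."""
    return pci_per_liter * BQ_PER_PCI * LITERS_PER_M3


def bq_per_m3_to_pci_per_liter(bq_per_m3: float) -> float:
    """Convert an activity concentration from Bq/m^3 to pCi/L."""
    return bq_per_m3 / (BQ_PER_PCI * LITERS_PER_M3)


def sv_to_rem(sv: float) -> float:
    """Convert a dose equivalent from sievert to rem."""
    return sv * REM_PER_SV


if __name__ == "__main__":
    print(pci_per_liter_to_bq_per_m3(4.0))
    print(bq_per_m3_to_pci_per_liter(150.0))
    print(sv_to_rem(0.5))
```

The exact equivalent of 4 pCi per liter is 148 Bq per m³, which the text rounds to 150 Bq per m³.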


Table 4 Estimated lifetime risk of lung cancer attributable
to continuous life-long exposure to indoor ²²²Rn at a concentration
of 150 Bq per m³ (4 pCi per liter) (a)

Source (Reference)                              Lifetime Risk (%) (b)

National Council on Radiation Protection        0.9
  and Measurements (66)
International Commission on                     1.6
  Radiological Protection (36)
National Academy of Sciences (58)               3.4 (Men)
                                                1.4 (Women)

(a) 150 Bq per m³ (4 pCi per liter) corresponds to the EPA guideline
for maximum acceptable ²²²Rn concentration in indoor air.

(b) Risk values shown apply to the general population, including
both smokers and nonsmokers, unless otherwise specified.


Environmental Background Radiation Other than Radon


Exclusive of the dose to the bronchial epithelium from radon, the average
dose of radiation received by members of the US population from other
natural sources approximates 1 mSv (100 mrem) per year (Table 1). The
risks, if any, that may be associated with this level of irradiation can be
estimated only by extrapolation, based on the dose-response models discussed
above. The risk estimates for heritable disorders, presented in Table 2, imply
that only a small percentage of all such diseases is attributable to natural
background irradiation; however, the estimates are fraught with uncertainty,
for reasons already mentioned. The corresponding risk estimates for
carcinogenic effects imply that no more than 1-3% of all cancers in the
general population are caused by natural background irradiation (57).
Epidemiological studies of the extent to which the rates of disease vary
correspondingly in relation to natural background radiation have yielded
results that are generally consistent with the above estimates. The largest such 
study, which involves a sizable population residing in an area of elevated 
natural background radiation in Yanjiang County, China, has failed to detect a 
significant variation in disease frequencies attributable to differences in nat- 
ural background radiation levels. However, this study did find the frequency 
of cytogenetic abnormalities in the circulating lymphocytes of persons in the
high-background area to be increased (102). Studies of disease rates in other 
high-background areas in the US, England, and other countries have produced 
varying results, which must be considered inconclusive because of the possi- 
ble influence of confounding factors (57). Although several studies have 
found that the rates of cancer and other diseases vary inversely with natural 
background radiation levels, which some investigators have interpreted as 
evidence of beneficial (or “hormetic”) effects of low-level irradiation, the 
relationship does not persist after the effects of altitude and other confounding 
variables have been adequately controlled (57, 105). 

The occurrence of clusters of childhood leukemia in the vicinity of Sellafield,
Dounreay, and other nuclear installations in the UK has aroused concern
over the possibility that the leukemias may have been caused by radiation
released from the plants (6, 14). The releases are estimated to have increased 
the total radiation dose to surrounding populations by less than 2% (19), 
however, thus prompting the search for other possible causes. The possibility 
of an infective etiology has been suggested by the finding of a comparable 
excess of childhood leukemia in the New Town of Glenrothes, which contains 
no radiation facility but otherwise resembles Sellafield, Dounreay, and other 
nuclear plant sites that have recently experienced a large influx of population 
(41). Also supporting the hypothesis that some factor other than radiation may 
be responsible for the leukemia excesses is the finding that villages near 
potential sites of nuclear plants in the UK have shown similar excesses of 
leukemia (15). 

An additional putative explanation has emerged from a case-control study 
by Gardner et al (24), which suggested that the excess leukemias near the 
Sellafield plant may have resulted from occupational irradiation of the fathers 
of the affected children. This inference, although based on a statistically 
significant odds ratio, rests on only four cases with significantly exposed 
fathers. Arguing against this interpretation is the fact that children of A-bomb 
survivors who were conceived postirradiation have shown no excess of 
childhood leukemia (111); mutations are not induced by radiation at a high 
enough frequency to account for the leukemias in question (1); and 
epidemiological studies of other populations in the UK have failed to confirm 
the association between paternal irradiation and the occurrence of childhood 
leukemia (50, 98). It is noteworthy that the rates of mortality from childhood 
leukemia in US counties that contain nuclear installations have shown no
excess (38). 

Although the rates of leukemia in southwestern Utah have increased in 
association with the deposition of radioactive fallout from nuclear weapons 
tests during 1952-1958, no consistent trend for all forms of leukemia or other 
types of cancer has been evident (57). Nevertheless, a significant excess of 
acute leukemia has been reported in those who have received as much as 6-30 
mGy (600-3000 mrad) before age 20 and who died before 1964 (78). The 
excess appears somewhat larger than that which would be predicted on the 
basis of the risk models in Table 3, but the discrepancy is not statistically 
significant (75). 


Occupational Irradiation 


Early radiation workers were among the first to demonstrate carcinogenic 
effects of irradiation (95). Historic examples include carcinomas of the skin in 
pioneer radiologists and x-ray workers (23); leukemias in early radiologists 
(86); osteogenic sarcomas and carcinomas of cranial sinuses in early radium 
dial painters (48); and lung cancers in pitchblende and other underground 
hard-rock miners (99, 106). With the evolution of modern radiation protection 
standards, the occupational risks of such cancers have been drastically re- 
duced, but the extent to which they may still be elevated remains a subject of 
ongoing study. Radiation has thus continued to be one of the most thoroughly 
investigated of occupational carcinogens. 

Of the various forms of human cancer induced by irradiation, leukemia 
(excluding chronic lymphocytic leukemia) has the highest relative risk per unit
dose (57). However, only one of 17 recent studies of radiation workers has 
found a significant overall excess of leukemia (Table 5). Furthermore, 
although an excess in an isolated subgroup has been reported in one study, a 
significantly positive dose-response relationship for the disease has not been 
observed in any of the nine studies in which individual radiation exposure data 
were analyzed (Table 5). These findings suggest that the average risk of 
leukemia in today’s radiation workers is no larger than would be predicted on 
the basis of the extrapolation models discussed above. 

Multiple myeloma is another form of cancer for which the relative risk 
following low-level irradiation is comparatively high (18). Indeed, the rate is 
elevated in several occupationally exposed cohorts, including workers of the 
Hanford nuclear facility (26), but in the majority of cohorts investigated the 
disease has shown no increased frequency (Table 6). Again, there is little 
reason to conclude that the risk of multiple myeloma in radiation workers is 
any higher than would be predicted on the basis of the extrapolation models 
discussed above. 
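
Cohort comparisons of this kind rest on the ratio of observed to expected cases. The sketch below computes that ratio and, as one simple way to judge whether an excess could be due to chance, a one-sided Poisson tail probability. The Poisson test is an assumption made for illustration and is not necessarily the method used in the cited studies; the function names are hypothetical.

```python
import math


def observed_expected_ratio(observed: int, expected: float) -> float:
    """Ratio of observed cases to the number expected from reference rates."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return observed / expected


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) when X follows a Poisson distribution with mean expected."""
    if observed < 0 or expected <= 0:
        raise ValueError("observed must be >= 0 and expected must be positive")
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Hanford workers: 4 cases observed against 2.1 expected (Table 6).
    print(observed_expected_ratio(4, 2.1))
    print(poisson_upper_tail(4, 2.1))
```

For the Hanford figures the tail probability is about 0.16, so under this simple test an excess of that size on so few cases would not by itself be statistically significant.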

Table 6 Estimated relative risks of multiple myeloma in radiation workers, as indicated by
recent epidemiological studies (a)

                                                                       Mean Tissue
Study (References)                       Observed/Expected   Ratio    Dose (mSv)

Early US radiologists (49)               11/7.9                        2400-6000
Early UK radiologists (74)               0/1.0               0         50-100/yr
Japanese radiation technologists (4)     2/0.7                         ≈600
Chinese x-ray workers (101)              0/1.3                         ≈1000
US Hanford workers (25, 26)              4/2.1                         23
US Oak Ridge National Lab (109)          1/≈3.4              ≈0.3      17
US Oak Ridge fabrication workers (13)    4/2.8               1.4       ≈10
US Linde fabrication workers (84)        3/3.2               0.9       (?)
US uranium millers (104)                 1/≈1.5              0.7       (?)
US Rocky Flats (25, 108)                 1/0                           41
UK Atomic Energy Authority (8)           3/5.3               0.6       32
UK Atomic Weapons Establishment (7)      2/3.6               0.6       8
UK Sellafield (75)                       7/4.2                         124
Atomic Energy of Canada, Ltd. (34)       1/2.1               0.5       47
US radium-dial painters (76)             6/2.2               2.8       26

(a) Portions of this table are adapted from Miller & Beebe (51) or Cuzick (18).
(b) ≈ denotes approximate values estimated for this tabulation.


Other forms of cancer have also been occasionally reported to be increased
in frequency in some cohorts of radiation workers. Mortality from lung cancer
and leukemia, for example, increased with increasing dose in workers of the
Oak Ridge National Laboratory (109). However, data on the smoking habits 
of the workers were not available, the increase in lung cancer was not 
significant in nonmonthly workers, there were no deaths from lung cancer in 
male monthly workers whose accumulated doses exceeded 40 mSv, lung 
cancer mortality in the group as a whole was less than two thirds of that which 
would have been expected from national rates, and the analysis of mortality 
from leukemia did not exclude deaths from chronic lymphocytic leukemia, 
which bear no known relation to previous irradiation. Hence, as has been 
emphasized elsewhere (81), the findings must be interpreted with caution. 
Excesses of other forms of cancer have also been reported occasionally in 
various occupationally exposed cohorts; however, because of methodological 
problems and other sources of uncertainty, the findings are of equivocal 
public health significance (57). 


Radiation Accidents 


In view of the many powerful sources of radiation in the modern world, the 
low frequency with which persons are accidentally irradiated attests to the 
effectiveness of existing safeguards. In spite of elaborate precautions, howev- 
er, some 285 nuclear reactor accidents were reported in various countries 
between 1945 and 1987 (excluding the Chernobyl accident), thus resulting in
the exposure of more than 1350 persons, with 33 fatalities (46). In most such 
accidents, the public has not been directly affected, but in each of the two 
most serious recent reactor accidents, the one at Three Mile Island (TMI) in
1979 and the one at Chernobyl in 1986, enough radioactive material was
released into the environment to pose a potential threat to the health of off-site 
populations. 

Fortunately, in the TMI accident the largest dose to anyone residing near 
the reactor was no larger than the annual dose normally received from natural 
background irradiation (96). Not unexpectedly, therefore, the accident re- 
sulted in no detectable increase in birth defects or infant mortality in the 
surrounding population (85). Nevertheless, those living in the vicinity of TMI 
have shown evidence of persistent psychological stress (11), which has been 
implicated as a possible explanation for the temporary rise in the annual 
incidence of cancer during 1982-1984 in those residing closest to the plant 
(33). Because there was no associated increase in cancer mortality in this 
population, the possibility that the rise in incidence may have resulted from 
early ascertainment, as a consequence of heightened surveillance prompted by 
postaccident concern, has also been suggested (33). 

In the Chernobyl accident, a far larger release of radioactivity occurred, 
thus necessitating the evacuation of tens of thousands of people and farm 
animals from the surrounding area. The accident caused a collective dose 
commitment to the population living within 30 km of the plant that is 
estimated to approximate 16,000 person-Sv (1,600,000 person-rem) (20). A 
dose commitment of this magnitude can be projected, on the basis of the 
aforementioned risk models (Table 3), to increase the lifetime risk of cancer 
in the population by as much as 4-8%, i.e. to multiply the natural risk by a 
factor of 1.04-1.08. Beyond 30 km, the accident resulted in a collective dose 
equivalent commitment to the Northern Hemisphere of approximately 
600,000 person-Sv (60,000,000 person-rem) (87), which can be projected to 
cause as many as 30,000 excess cancers within the next 70 years, more than 
one third of which would occur in Byelorussia alone. Although such a large 
number of excess cancers would be a public health disaster, the projected 
excess in Byelorussia corresponds to less than 1% of the cancers expected
to occur there “spontaneously” during the next 70 years. Except in the most 
heavily exposed subpopulation, therefore, the excess is unlikely to be detect- 
able. 
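
Projections of this kind multiply a collective dose by a risk coefficient per unit dose. The sketch below derives per-Sv coefficients from the Table 3 totals and applies a coefficient to the Northern Hemisphere collective dose. The value of 0.05 per Sv is an assumption chosen because it reproduces the 30,000 figure quoted above; the text does not state which coefficient was used, and the function names are hypothetical.

```python
def risk_per_sv(deaths_per_100k: float, dose_sv: float) -> float:
    """Convert a tabulated rate (deaths per 100,000 at a given dose) to risk per Sv."""
    if dose_sv <= 0:
        raise ValueError("dose_sv must be positive")
    return deaths_per_100k / 100_000 / dose_sv


def projected_excess_cancers(collective_dose_person_sv: float,
                             risk_coefficient_per_sv: float) -> float:
    """Linear nonthreshold projection: collective dose times risk per unit dose."""
    return collective_dose_person_sv * risk_coefficient_per_sv


if __name__ == "__main__":
    # Table 3 totals: 460-780 deaths per 100,000 at 0.1 Sv of rapid irradiation.
    print(risk_per_sv(460, 0.1), risk_per_sv(780, 0.1))
    # Assumed coefficient of 0.05 per Sv applied to 600,000 person-Sv.
    print(projected_excess_cancers(600_000, 0.05))
```

The Table 3 totals correspond to roughly 0.046-0.078 per Sv for rapid irradiation, so the assumed coefficient lies near the lower end of that range. Because the projection is linear, it depends only on the collective dose and not on how that dose is distributed among individuals.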


Medical Irradiation 


Diagnostic medical irradiation accounts for the largest part by far—over 
80%—of population exposure to radiation from man-made sources (Table 1). 
Although the use of radiography for diagnostic purposes has now been widely
accepted, mammographic screening for breast cancer in asymptomatic 
women was challenged initially because its benefits were not well 
documented and the doses it delivered to the breast were appreciable. In the 
past 20 years, improved film-screen and xeromammographic methods have 
reduced breast doses significantly. Meanwhile, the evidence from several 
randomized trials has clearly shown reductions in breast cancer mortality of 
roughly 30% from mammographic screening for women over age 50, and the 
benefits in terms of mortality prevented are five to 100 times as great 
(depending upon age and dose) as the corresponding radiation risks (65). For
women under age 50, the evidence for benefits from mammographic screening
remains equivocal (16). This issue needs to be resolved in view of the
recommendations by some organizations to conduct routine mammographic
screening at ages 40-49, but the ongoing randomized mammographic
trials may contain too few women under age 50 to resolve it. In
women with a familial or personal history of breast cancer, however, it is still 
recommended that mammographic screening begin at age 35 (16). 

The thyroid gland appears to be unusually susceptible to radiation carci- 
nogenesis, especially in childhood, judging from the effects of acute irradia- 
tion in A-bomb survivors and in patients treated with x-rays to the head and 
neck (57). Medical exposure to iodine-131 (¹³¹I), which concentrates in the
thyroid gland and has a seven-day effective biological half-life, is therefore a
matter of concern, as is the potential for exposure to ¹³¹I released from nuclear
plant accidents or in nuclear bomb fallout. Also of concern is the potential for
exposure to ¹²⁵I, which is widely used for laboratory purposes in science,
medicine, and industry. Although one animal study has suggested that ¹³¹I is
no less effective than x-rays in inducing thyroid cancer (43), the human data
suggest that it may be three to ten times less effective (73). Unfortunately,
because the available follow-up data on children who have received ¹³¹I are
sparse, there is considerable uncertainty in the estimate. The experience of the
USSR population exposed to moderate or large doses of ¹³¹I in the Chernobyl
accident can advance our understanding of the effects of ¹³¹I if investigated by
carefully controlled, long-term studies.

Early reports by Stewart (79) and others, which suggested the induction
of childhood leukemia by in utero exposure to only a few mSv from
maternal abdominal diagnostic x-ray examinations, were controversial because
no excess leukemia had been seen in Japanese A-bomb survivors who
were irradiated in utero (60). The earlier case-control studies have since been 
essentially confirmed by a large cohort study and several studies of twins (32, 
52, 53). The fact that adult cancers are now also occurring with increased 
frequency among prenatally exposed A-bomb survivors (110) attests further 
to the radiosensitivity of the fetus. 


IMPLICATIONS FOR EXPOSURE LIMITS


With the abandonment of the threshold hypothesis for mutagenic and 
carcinogenic effects of radiation, the setting of permissible exposure limits
has become inextricably linked to assessment of the risks of such effects at 
low levels of exposure. In recognition of the possible existence of such risks, 
contemporary radiation protection practices are guided by the following 
principles (35): 


1. Justification: Any activity that causes radiation exposure should produce a
sufficient benefit to offset the harm it may cause.

2. Optimization: For any source of radiation, the likelihood of exposure and
the dose it may deliver should be as low as reasonably achievable, and
economic and social factors should be considered.

3. Dose limits: The dose to each individual, as well as the likelihood of
exposure from all sources, should be subject to control.


In keeping with these principles, the system of radiation protection that has 
evolved includes exposure limits for every organ of the body. Furthermore, 
because the latest risk estimates (Table 3) imply that annual exposure of 
radiation workers to 50 mSv (5 rem) per year (the present maximum permissible
dose limit) would ultimately increase the lifetime risk of cancer in such
workers by more than 30%, the International Commission on Radiological 
Protection has recommended that the dose limit for workers be reduced to 20 
mSv (2 rem) per year averaged over a period of five years, with no more than 
50 mSv (5 rem) in any one year (35). 
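
The two-part limit described above (an average of 20 mSv per year over five years, and no more than 50 mSv in any single year) can be checked against a worker's dose record as sketched below. Treating the five-year period as a rolling window is an assumption made for illustration, since the text says only "averaged over a period of five years"; the function and constant names are hypothetical.

```python
from typing import Sequence

ANNUAL_CAP_MSV = 50.0          # no more than 50 mSv in any one year
FIVE_YEAR_AVERAGE_MSV = 20.0   # 20 mSv per year averaged over five years
WINDOW_YEARS = 5


def complies_with_recommended_limits(annual_doses_msv: Sequence[float]) -> bool:
    """Return True if a sequence of annual doses meets both parts of the limit."""
    # Part 1: the single-year cap.
    if any(dose > ANNUAL_CAP_MSV for dose in annual_doses_msv):
        return False
    # Part 2: every run of five consecutive years must total 100 mSv or less.
    # Records shorter than five years are checked against the annual cap only.
    window_limit = FIVE_YEAR_AVERAGE_MSV * WINDOW_YEARS
    for start in range(len(annual_doses_msv) - WINDOW_YEARS + 1):
        if sum(annual_doses_msv[start:start + WINDOW_YEARS]) > window_limit:
            return False
    return True


if __name__ == "__main__":
    print(complies_with_recommended_limits([50, 10, 10, 10, 10]))  # one high year
    print(complies_with_recommended_limits([30, 30, 30, 30, 30]))  # steady 30 mSv
```

The first record passes because a single 50 mSv year is offset by four low years (90 mSv in total), whereas a steady 30 mSv per year fails on the five-year average even though no single year exceeds the cap.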

Although these new recommendations will probably not affect the majority 
of radiation workers greatly, as their exposures are already well below the 
present maximum permissible dose limit, implementation of the recom- 
mendations could reduce the level of exposure for the small subpopulation 
(<1%) of workers who now approach the dose limit repeatedly. In reducing 
the exposure of this subpopulation, it is important that its collective exposure 
not merely be redistributed over a larger fraction of the total workforce, as the 
same numbers of radiation-induced cancers would ultimately be expected to 
result. 

Reduction of the dose to the patient is an important goal in the medical uses 
of radiation, because a 15% reduction in exposures from diagnostic radiology 
would reduce the total exposure of the population as much as the elimination 
of all other man-made sources of ionizing radiation (100). Perhaps the 
greatest room for improvement lies in the disparity between the state of the art 
in administering diagnostic radiation and the prevailing practices of the 
radiological medical community as a whole (21). A national survey of 
diagnostic exposures has shown that a small but significant fraction of
radiologic practices delivers doses in common radiologic diagnostic pro- 
cedures that are more than five times larger than the norm (40). Also, a recent 
report has noted that, in spite of technological advances, there has been 
relatively little change in patient diagnostic exposure levels since the 1970s 
(62). Finally, as noted above, measures to limit residential exposure to radon 
also are indicated. 


INTRODUCTION


The health effects of exposures related to fighting fires have long been a major
interest of occupational health investigators. Municipal firefighters are an 
unusually accessible and well-documented group of workers, as there are 
extensive records on their health and work history. The occupation has been 
studied intensively for evidence of chronic health effects. Interest in the health 
problems of firefighting increased considerably during the 1980s. A sub- 
stantial body of work is now available that may lead to a reevaluation of many 
unresolved issues. 

Firefighters are exposed to serious chemical and physical hazards, to a 
degree that is unusual in the modern work force. The acute hazards of 
firefighting, primarily trauma, thermal injury, and smoke inhalation, are 
obvious. A large literature has been developed on acute pulmonary injury
associated with inhalation of hot air and toxic constituents of smoke, particu- 
larly the combustion products of commonly used plastics (18, 30, 63). The 
hazards of carbon monoxide and cyanide are particularly well recognized (4, 
18). Although the acute health effects of these life-threatening hazards and the 
risk of physical injury in structures affected by fire are indisputable, the 
chronic health effects that follow recurrent exposure are not clear (3). Studies 
that directly address the health experience of firefighters have not yielded 
consistent results until relatively recently. This uncertainty has led to a 
patchwork of employment and workers’ compensation board policies. 

Firefighting is an unusual occupation, as it is perceived as dirty and
dangerous, but indispensable and admirable. Firefighters, almost universally, 
enjoy public admiration and gratitude to a degree unmatched by other occupa- 
tions, particularly in the public sector (90). Their occupation is rich in stories 
of personal courage, spirit in the face of adversity, and teamwork. However, 
firefighters also experience a constant awareness of imminent danger and the 
feeling that the next alarm may challenge them to the limit (100). The health 
of firefighters, and their willingness to face the hazards, cannot be fully 
understood without appreciating this psychological dimension (105). 


HAZARDS 


Occupational hazards experienced by firefighters may be categorized for 
convenience as physical, thermal and ergonomic, chemical, and psycholog- 
ical. The level of exposure experienced by a firefighter in a given fire depends 
on what is burning, the combustion characteristics of the fire, the structure on 
fire, the presence of nonfuel chemicals, the measures taken to control the fire, 
the presence of victims requiring rescue, and the position or line of duty held 
by the firefighter while fighting the fire. The hazards and levels of exposure 
experienced by the first firefighter to enter a burning building are different 
from those of the firefighters who enter later or who clean up after the flames 
are extinguished. However, the career exposure profiles of firefighters tend to 
average out the longer they spend in a particular rank. There is rotation among 
the active firefighting jobs in each platoon and a regular transfer of personnel 
between fire halls. Firefighters therefore have a similar probability of expo- 
sure in typical fire situations as long as they stay classified as a “firefighter”; 
“captains” accompany and direct the crews, but are still actively involved in 
fighting the fire on site. Thus, firefighting exposures tend to become similar 
over a longer period of time, although individual firefighters may still experi- 
ence unusual exposures in particular incidents. 

Within the last 20 years, the introduction of the self-contained breathing 
apparatus (SCBA) and other protective equipment has created a much safer 
working environment for the firefighter. However, the added weight of the 
equipment increases the physical exertion required. The protective clothing 
also becomes much heavier when it gets wet. 


Thermal Hazards 


Heat stress is compounded in firefighting by the combination of insulating 
properties of the protective clothing and physical exertion, which results in 
endogenous heat production (6). 

Hot air alone is not usually a great hazard to the firefighter. Air heated 
above body temperature cools as it passes through the larynx (38). Dry air 
also does not have much capacity to retain heat and delivers little to the lower 
respiratory tract, so that heat-induced inhalation injury is not usually a risk 
when the air is dry. However, inhaled steam or hot wet air can cause serious 
burns to the lower airway simply because of the high latent heat capacity. 
Much more heat energy can be stored in water vapor than in dry air. 
Fortunately, steam inhalation is not common (91). 
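
The contrast between dry air and steam can be made concrete with a rough energy balance. The sketch below is illustrative only: it assumes air or steam entering the airway at 100°C and cooling to a body temperature of 37°C, and it uses standard textbook approximations for the specific heat of dry air (about 1.005 J/g·K), the specific heat of liquid water (about 4.186 J/g·K), and the latent heat of condensation of steam (about 2257 J/g). None of these figures comes from the studies cited here, and the function names are hypothetical.

```python
# Rough energy balance: heat released per gram of inhaled gas on cooling
# from an assumed 100 C inlet temperature to an assumed 37 C body temperature.
# All constants are textbook approximations, not values from the cited studies.

CP_DRY_AIR = 1.005      # J/(g*K), specific heat of dry air
CP_WATER = 4.186        # J/(g*K), specific heat of liquid water
LATENT_HEAT = 2257.0    # J/g, latent heat of condensation of steam at 100 C


def heat_from_dry_air(inlet_c: float = 100.0, body_c: float = 37.0) -> float:
    """Joules released by 1 g of dry air cooling from inlet_c to body_c."""
    return CP_DRY_AIR * (inlet_c - body_c)


def heat_from_steam(body_c: float = 37.0) -> float:
    """Joules released by 1 g of steam at 100 C condensing, then cooling to body_c."""
    return LATENT_HEAT + CP_WATER * (100.0 - body_c)


if __name__ == "__main__":
    dry = heat_from_dry_air()
    wet = heat_from_steam()
    print(f"dry air: {dry:.0f} J/g, steam: {wet:.0f} J/g, ratio: {wet / dry:.0f}x")
```

Under these assumptions, a gram of steam delivers roughly 40 times the heat of a gram of dry air at the same temperature, most of it from condensation.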

Radiant heat is typical of a fire situation and may be associated with skin 
changes, particularly erythema and telangiectasia, in the absence of obvious 
burns (107). 


Chemical Hazards 


Firefighters on the scene of a fire are frequently exposed to carbon monoxide, 
hydrogen cyanide, nitrogen dioxide, sulphur dioxide, hydrogen chloride, 
aldehydes, and such organic compounds as benzene (41, 59, 113). Before 
arriving and on return, firefighters are exposed to diesel exhausts at the fire 
station (37). 

In the 1970s, about 80% of injuries of firefighters in service resulted from 
smoke inhalation or oxygen deficiency. More than 50% of fire-related fatali- 
ties are the result of smoke exposure, rather than burns (4, 18, 30, 77). One of 
the major contributing factors to mortality and morbidity in fires is hypoxia 
because of oxygen depletion in the affected atmosphere, which leads to loss of 
physical performance, confusion, and inability to escape (91). Another factor 
is the toxicity of the constituents of smoke, singly and in combination. 

The study of the toxicology of smoke as a complex mixture and its 
individual constituents has advanced in recent years and has led to better 
designs for personal protection and fire management strategies. Smoke is a 
variable mixture of compounds, each possessing specific toxicological prop- 
erties and contributing to interactive toxic effects. Therefore, the toxicity of 
smoke varies greatly, depending primarily on the fuel, the heat of the fire, and 
whether or how much oxygen is available for combustion. However, all 
smoke, including that from simple wood fires, is hazardous and potentially 
lethal with concentrated inhalation (26). The complexity of the chemical 
composition of smoke is also due, in part, to the presence of secondary 
products; after the products of combustion are formed, they remain chemical- 
ly active and continue to react long after the fire has ceased to burn. The list of 
chemicals of toxicological concern is long (21, 26). Smoke from burning oil 
has been characterized and found to have mutagenic activity in in vitro assays 
(5); this is undoubtedly also true for other common types of fires. 

Smoke is made up of two components, particulates and gases, which are 
suspended or dissolved in a third component, hot air (Table 1). The degree of 
exposure experienced by a firefighter is determined by the chemistry and 
quantity of gases produced at the fire, the concentrations reached, the size 
distribution of the particulate phase, solubility properties of the gaseous 
constituents as a predictor of the degree of penetration to the lower respiratory 
tract, and the duration of exposure (24). 

Table 1 Products of combustion of commonly burnt materials 

Combusted material                      Fuel component of          Toxic decomposition 
                                        original material          products* 

wood, paper, cotton, jute               cellulose                  aldehydes, acrolein 
clothing, fabric, blankets,             wool, silk                 hydrogen cyanide, ammonia, 
  furniture                                                          hydrogen sulfide 
tires                                   rubber                     sulphur dioxide, hydrogen sulfide, 
                                                                     methyl mercaptan, 
                                                                     benzene-related compounds 
upholstery material, wire, pipe         polyvinyl chloride         hydrogen chloride, phosgene 
  coating, wall, floor, furniture 
  coverings 
insulation, upholstery material         polyurethane               hydrogen cyanide, isocyanates, 
                                                                     oxides of nitrogen 
clothing, fabric                        polyester                  hydrogen chloride 
upholstery material, carpeting          polypropylene              acrolein 
appliances, engineering plastics        polyacrylonitrile          hydrogen cyanide, nitriles, 
                                                                     oxides of nitrogen 
carpeting, clothing                     polyamide (nylon)          hydrogen cyanide, ammonia, 
                                                                     oxides of nitrogen 
household and kitchen goods             melamine resins            hydrogen cyanide, ammonia, 
                                                                     formaldehyde, oxides of nitrogen 
aircraft windows, textiles              acrylics                   acrolein 
kitchen goods, electrical               polytetrafluoroethylene    octafluoroisobutylene 
  insulation, gaskets                     (Teflon®) 
photographic film                       nitrocellulose             oxides of nitrogen 

*Carbon monoxide and carbon dioxide are produced in all cases. Other gases, such as the aldehydes, 
methane, and low-molecular-weight organic acids, are common in most fires. 

Particulates generated from burning wood or other organic matter are 
composed of chemically inert carbon particles onto which other chemical 
substances become adsorbed (coated). These agents produce an irritating effect and 
cause cough and an acute bronchitis. The particles may also become carriers 
of less volatile substances, such as chlorinated hydrocarbons, depending on 
the composition of the combustion products (16, 91). 

The chemical components of the gaseous phase of smoke, many of which 
are adsorbed onto the particulates and thereby penetrate more deeply into the 
lower respiratory tract than they otherwise would, are responsible for most of 
the toxicological effects that result from smoke inhalation (38). Laboratory 
simulations of different fire conditions have permitted the characterization of 
many of the combustion products given off by burning natural and synthetic 
materials commonly found in building structures and furnishings. The exact 
composition of the combustion products varies, depending upon the composition 
of the burning material and the temperature at which each material 
undergoes thermal decomposition (54). 

Carbon monoxide is considered the most common, characteristic, and 
serious acute hazard of firefighting. Carboxyhemoglobin accumulates rapidly 
with duration of exposure as a result of the affinity of carbon monoxide for 
hemoglobin, and high levels may result, particularly when heavy exertion 
increases minute ventilation and increases delivery of carbon monoxide to the 
lung during unprotected firefighting (42, 62, 98). Not surprisingly, levels of 
carbon monoxide measured on the scene at fires often exceed the Occupation- 
al Safety and Health Administration’s short-term exposure levels (8). Carbon 
monoxide and carbon dioxide, which are natural products of combustion, are 
necessarily present at every fire. Hydrogen cyanide is also formed from the 
lower temperature combustion of nitrogen-rich materials, including such 
natural fibers as wool and silk and such common synthetics as polyurethane 
and polyacrylonitrile (92, 109, 115). Although elevated levels of thiocyanate 
as a marker for cyanide exposure are less common among firefighters than 
elevated carboxyhemoglobin levels (61), there is a close relationship between 
the two in firefighters who have sustained clinically significant smoke inhala- 
tion (20). 

Low-molecular-weight hydrocarbons, aldehydes (such as formaldehyde), 
and organic acids may be formed by hydrocarbon fuels that burn at lower 
temperatures (78). The oxides of nitrogen are also formed in large quantity 
when temperatures are high, as a consequence of the oxidation of atmospheric 
nitrogen, and in lower temperature fires in which the fuel contains significant 
nitrogen. When the fuel contains chlorine, particularly in the form of poly- 
vinyl chloride (PVC), hydrogen chloride is formed (24, 30). 

Most toxic components of smoke, with the exception of carbon monoxide 
and hydrogen cyanide, are only rarely produced in lethal concentrations. 
Different gas combinations are likely to present different degrees of hazard 
(91). 

The toxic products of polymeric, plastic materials have come under in- 
creasing scrutiny. Since the 1950s, these materials have been used in building 
construction and furnishings in Europe and North America in large amounts 
(83, 114). They were soon found to combust into particularly hazardous 
products. Acrolein, formaldehyde, and volatile fatty acids are common in 
smoldering fires of several polymers, including polyethylene and natural 
cellulose (78). These products were characterized in a series of elegant studies 
by Wooley, who found a close relationship between the temperature of 
combustion and the mix of nitrogen-containing products released from 
polyurethane. Generally, cyanide levels increase with temperature; acrylonitrile, 
acetonitrile, pyridine, and benzonitrile occur in large quantity above 
800°C but below 1000°C (114, 116, 124, 125). Combusted polyacrylonitrile 
is even richer in cyanide and nitriles (116). Polyvinyl chloride has been 
proposed as a desirable polymer for furnishings because of its 
self-extinguishing characteristics, caused by the high chlorine content. 
Unfortunately, the material produces large quantities of hydrochloric acid when 
fires are sustained, as they are when PVC is only part of the fire (33). 

Since the 1970s, there has been much interest in the relative toxicity of the 
mixed products of combusted materials, along with the recognition that no 
two fires are exactly alike (49, 54). Alarie and Anderson (1, 2) have 
conducted extensive research on the decomposition products of polymeric 
materials, by comparing the toxic effects to those of the wood of Douglas fir 
as a standard. Some materials, such as PVC, decompose rapidly and are 
rapidly lethal, but others, like polytetrafluoroethylene (PTFE), which also 
decomposes rapidly, kill more slowly even though they are ultimately more 
lethal over the duration of exposure than PVC. When rating the toxicity 
hazard of materials, one must consider the period of evaluation. Table 2 rates 
the toxicity of a variety of materials compared with a standard of 100 
representing Douglas fir, and considers both the time and exposure level 
required to produce a lethal effect (2). 

Table 2 Index for burning components of original products summarizing 
several characteristics that contribute to or suppress toxicity (the lower the 
index, the higher the toxicity) 

Burning material                          Index 
Douglas fir                               100.00 
compressed spruce, pine, fir slab          65.92 
fiberglass insulation                      63.59 
polyester resin                            41.58 
cellulose fiber                            17.80 
polyurethane foam                          12.95 
phenol formaldehyde-phenol resin            8.98 
isocyanate foam                             7.57 
modacrylic 
PVC 
wool 
polystyrene 
acrylonitrile/butadiene/styrene 
urea formaldehyde foam 
PTFE 

Within the last 20 years, the toxicity of smoke and its hazard to the 
firefighting profession have been fully recognized. In the early 1970s, the 
National Fire Protection Association published a report, “Breathing Apparatus 
for the Fire Service,” which, through increased recognition of the problem, 
led to the introduction of SCBA (127). Fire departments, such as 
Boston’s, adopted mandatory use regulations that reduced the 
number of smoke inhalations in subsequent years by as much as 80% (113). In 
1976, an evaluation of the effectiveness of SCBA showed that blood levels of 
carboxyhemoglobin in firefighters, as a measure of carbon monoxide exposure, 
were lowest in firefighters who used SCBAs, but roughly the same for 
intermittent users and nonusers. This study and others showed that firefighters 
who do not wear SCBAs during the “knock-down” and “overhaul” phases are 
at risk; unfortunately, this is the phase in which the flames are out, and the 
hazard only seems to be reduced (30, 93). 

Firefighters judge the level of hazard they face by the intensity of smoke 
and decide whether to use an SCBA solely on the basis of what they see. This 
may be very misleading, especially in the clean-up phase after the flames are 
extinguished (112). On superficial inspection, the fire setting may appear safe 
at this stage; however, it can be dangerous (112). There is no apparent 
correlation between the intensity of smoke and the amount of carbon monox- 
ide in the air (23, 60). Synthetic materials are most dangerous during smolder- 
ing conditions, as opposed to conditions of high heat (30, 53, 123). Concrete 
retains heat very efficiently and may act as a “sponge” for trapped gases that 
then out-gas from the porous material, thus releasing hydrogen chloride or 
other toxic fumes long after a fire has been extinguished (30, 108). Firefight- 
ers should also be cautioned against cigarette smoking during the clean-up 
phase, as this adds to the already elevated levels of carbon monoxide in the 
blood (65, 110). 

The hazards presented by unusual constituents in smoke are 
too variable and complex for detailed discussion here. The combustion products 
of many industrial chemicals and mixtures are unknown or only poorly 
characterized. A major problem with fires that involve chemical sources or 
storage facilities is knowing how to fight them with the least potential danger 
to the firefighter and local residents. Recent studies have suggested novel 
approaches to such situations. In the case of fires in an insecticide storage 
facility, Jeffries & Schiefer (52) have suggested that optimal management 
might be to enhance combustion intentionally to produce a fast, hot fire that 
combusts the material more completely and carries the airborne toxic products 
away from the vicinity vertically by convection currents. 


Psychological Hazards 


There are many sources of psychological stress in the life of a firefighter, in 
addition to the vicissitudes of daily life and career advancement (29, 70, 72). 
A firefighter regularly steps into a situation that others flee, thus accepting a 
level of personal risk that would be unacceptable in most other occupations. 
Although this risk is controlled to the extent possible with fire equipment and 
personal protection, the reality of firefighting is that much can go wrong in 
any fire, and the course of a serious fire is often unpredictable (51, 70). 

Besides personal security, the firefighter must be concerned with the safety 
of others threatened by the fire and is sometimes a witness to pain, injury, and 
strong emotion. Rescuing victims is an especially stressful activity. The loss 
of a victim, especially a child, is reported in numerous anecdotes to be the 
most stressful experience a firefighter can endure. 

The professional life of a firefighter is not an endless round of anxious 
waiting punctuated by stressful crises, however. Firefighters enjoy the many 
positive aspects of their work. The work is intrinsically interesting and, 
during alarms, presents a great deal of stimulation and variety. Few occupa- 
tions are so unequivocally favored by public opinion or so respected by the 
community. Job security is largely assured in urban fire departments once a 
firefighter is hired, the pay usually compares well with other jobs, and the 
schedule allows ample opportunities for “moonlighting” between shifts. 
When a firefighter answers an alarm, there is a degree of apprehension and 
stress, but there is also exhilaration and a sense of purpose. These positive 
aspects of the job mitigate the stressful aspects and tend to protect the 
firefighter against the emotional consequences of repeated stress (51). 

At the sound of an alarm, firefighters experience a degree of immediate 
anxiety because of the inherent unpredictability of the situation that they are 
about to encounter. Some investigators believe that the psychological stress 
experienced at this moment is as great as, and perhaps greater than, any of the 
stresses that follow during the course of responding to an alarm. En route, 
hazardous traffic maneuvers and high noise levels from the sirens contribute 
to stress (94). Physiological and biochemical indicators of stress have also 
been assessed among firefighters (29). 


HEALTH EFFECTS 


Some jurisdictions attempt to justify cases of health disorders on an individual 
basis, and others prescribe selected chronic diseases as compensable occupa- 
tional disorders among firefighters. The problem is compounded by changes 
in technology over several decades; the risks and their outcomes vary with the 
era during which the firefighters entered the workforce. Generally speaking, 
firefighters who entered service before the 1960s were exposed to smoke of 
less acute toxicity, but lacked personal protection equipment of acceptable 
effectiveness. Those entering within the last two decades have primarily been 
exposed to smoke of greater toxicity, but have had more effective respiratory 
protection available. Firefighters who joined the force in the last few years 
may have the benefit of both fire-retardant materials produced under more 
stringent safety codes and also respiratory protection meeting contemporary 
standards of effectiveness. 


Acute Effects 


INJURIES Injuries associated with firefighting are predictable: burns, falls, 
and injury from falling objects. Jobs with a high risk of burns include those 
involving early entry and close-in firefighting, such as holding the nozzle. 
Burns are more commonly associated with basement fires, recent prior injury, 
and training outside the fire department of present employment. Falls tend to 
be associated with SCBA use, assignment to truck companies with climbing 
equipment, and, suggestively, childlessness. (Without dependent children, an 
individual may be more likely to take risks.) However, age and experience do 
not seem to be associated with risk of injuries in service (47). 


RESPIRATORY DISORDERS The respiratory effects of exposure to smoke 
and fumes from fires have been a major concern. Acute smoke inhalation 
carries a high mortality for unprotected victims (19) and is often combined 
with burns and other trauma. Fatal and overwhelming smoke inhalation has 
been reviewed extensively in the clinical literature (18, 22), and its man- 
ifestations are not unique to firefighting. 

Transient changes have been associated with unremarkable fires (59, 82, 
100), as well as fires involving certain chemicals (14), such as burning 
polyvinyl chloride (30), silicone plastic (39), butyl rubber insulation (82), and 
isocyanates (6, 73). In those cases in which the fires did not present an 
unusual hazard, the decrement in airflow (measured as the reduction in FEV1, 
the forced expiratory volume in one second, between the beginning of a shift 
and return after an alarm) correlated with the concentration of particulate 
matter in the smoke cloud and the presence of eye irritation, but not the 
duration of exposure, work shift, or smoking history. Persistent effects, 
including neurological impairment, have been noted following exposures that 
involve isocyanate fumes (6), but in only a subset of firefighters exposed to 
burning polyvinyl chloride (30, 69, 111). In at least one case, these changes 
mimicked asthma, with wheezing and refractory bronchoconstriction (14). 

Airway responsiveness increases after firefighting exposures (56, 82, 100, 
101). Increased airways reactivity following minor smoke inhalation during 
routine firefighting is a complex response, more complicated than 
bronchoconstriction, which results from irritation. The response is persistent, 
does not correlate with baseline methacholine sensitivity, and is associated 
with acute but transient increases in airways responsiveness (82, 100, 101). In 
at least one case, exposure resulted in airflow obstruction, initially responsive 
to bronchodilators, that became progressively more severe despite treatment, 
until the patient died of respiratory failure two years later (14). In this 
case, the pathology may have resembled mucoid impaction syndrome. 

There are very few studies of the pulmonary response to smoke in con- 
trolled situations in firefighting. Minty et al (74) studied nonsmoking 
firefighting instructors in the Royal Navy, by obtaining data on smoke 
composition, pulmonary function, and alveolar-capillary permeability follow- 
ing brief exposure to wood and diesel fires in training exercises. They found 
no change in pulmonary function, but they did find an elevated permeability 
measure, which suggests that exposure to the smoke either damaged the 
integrity of the alveolar-capillary barrier or initiated a low-grade in- 
flammatory response that in turn had the same effect because of release of 
proteolytic enzymes. 


Chronic Health Effects 


Several early studies examined firefighting, along with numerous other 
occupations, by using vital records for large populations (43, 45, 77, 88, 117, 
118, 120, 121). These have sometimes been difficult to interpret because of 
methodological issues and misclassification (43). Two early cohort studies 
have been recognized for their usefulness in evaluating the health risks of 
firefighters: Mastromatteo’s 1959 study on a cohort of firefighters in Toronto 
(71) and Musk et al’s 1978 study on a large cohort from Boston (79). The 
Mastromatteo study was a pioneer effort in the field, which illustrated the use 
of basic techniques that have been adopted in many studies since. Because of 
its date, however, the findings have uncertain application to the current 
situation. Since these early landmarks, other major cohort studies have recent- 
ly been contributed by Eliopulos et al (31) on a cohort from Western Austra- 
lia, Feuer & Rosenman (36) on a cohort from New Jersey, Vena & Fiedler 
(122) on a cohort from Buffalo, New York, Heyer et al (48) and Rosenstock 
et al (96) on a cohort from Seattle, Beaumont et al (13) on a cohort from San 
Francisco, and Guidotti (44) on a cohort from the Canadian province of 
Alberta. 

The chronic effects of greatest concern in studies of firefighters have been 
lung cancer, heart disease, and chronic obstructive pulmonary diseases. 
Recently, however, other forms of cancer, particularly genitourinary and 
colon and rectal, have emerged as likely associations. 

A chronic effect that has only recently been documented is that of an 
increased risk of congenital cardiac defects in the offspring of male firefight- 
ers. The responsible exposure is not known (85). 


CANCER Lung cancer has been the most difficult cancer site to evaluate in 
epidemiologic studies of firefighters. Despite the obvious exposure to carcinogens 
inhaled in smoke (15), it has been difficult to document an excess in 
mortality from lung cancer of a magnitude and consistency compatible with 
occupational exposure. Without question, cigarette smoking is a confounding 
exposure that complicates the analysis, but the prevalence of smoking among 
firefighters does not appear to be excessive compared with other blue collar 
occupations (40). Respiratory protection has probably reduced individual 
exposure levels since the 1970s, although it was not optimally used for many 
years in most fire departments. An effective form of respiratory protection 
was probably introduced too late to have substantially modified lung cancer 
rates that are currently observed. A major issue is whether the above- 
mentioned introduction of synthetic polymers into building materials and 
furnishings has increased the risk of cancer among firefighters because of 
exposure to the combustion products. 

The empirical findings on lung cancer from recent, well-designed 
epidemiological studies have been inconsistent. One study from Denmark 
(46), in which the comparison population is unusual, reported a standardized 
mortality ratio of 317 for older firefighters, whereas studies on cohorts from 
San Francisco and Buffalo showed no excess (13, 122). The possibility that 
an association is obscured by the healthy worker effect, in comparisons with 
the general population, is probably less likely for this cause of death than for 
other chronic diseases; over the long periods of observation typical for these 
studies, the mortality experience of initially selected workers can be expected 
to approach that of the general population more closely, especially for noncar- 
diovascular causes of death. Most studies have shown an excess of lung 
cancer on the order of 20-80% (48, 97), a magnitude not uncommon in 
studies of other blue collar occupations with less plausible exposure levels 
(43). In the most detailed analyses to date, a nonsignificant excess showed no 
clear distribution that would be consistent with duration of employment, 
exposure opportunity, or era of entry into the occupation (44, 122). 

Documentation of an association between lung cancer and occupational 
exposure as a firefighter remains elusive; many investigators continue to 
believe that an association exists. Markers of genotoxic effect suggest that 
carcinogenicity is likely to occur (64). An effect probably does exist, but it is 
likely to be heavily obscured by confounding factors and may not be as strong 
as anticipated. 

Other cancer sites have recently emerged as more consistent associations 
with firefighting. Evidence for an association with genitourinary cancers 
seems strong (97, 122). There is a less strong suggestion in the literature for 
colon and rectal cancers and for leukemia, lymphoma, and multiple myeloma 
(97, 122). 


PULMONARY DISEASE Most epidemiological studies of firefighters that 
report on mortality from chronic obstructive airways disease do not show an 
excess. There has been some concern that comparison to the general population 
may obscure a relative excess offset by the healthy worker effect. In one 
study, comparison with police showed a nonsignificant excess, but the police 
in this study showed an unusually low mortality (87). 

Although excess mortality and morbidity are difficult to demonstrate, there 
is evidence from serial studies of pulmonary function that firefighters are at 
risk for airways obstruction. Reports of progressive abnormalities in lung 
function among firefighters have suggested as much as a doubling of the 
expected rate of decline in lung function that normally affects aging adults, 
and this difference is associated with an increased frequency of respiratory 
symptoms (87, 103, 106). These reports sparked a wave of concern in the 
mid-1970s because they suggested an eventual appearance of chronic lung 
disease among firefighters (3). This effect is most apparent following un- 
usually severe exposures (119) and is associated with the number of fires 
fought over the first year or so. Declines of this magnitude have been 
associated with an increased risk for chronic obstructive airways disease 
(emphysema or chronic bronchitis) in other populations. Intense exposures 
have produced chronic changes in at least one case (63). If these findings are 
significant, one would expect an increase in mortality from chronic airways 
disorders compared with the general population. The above-mentioned stud- 
ies show no such effect. The cohort study by Musk et al (79), for example, 
showed a standardized mortality ratio of 93 overall (83 for active firefighters 
and 101 for retired firefighters), which is well within the expected range. Significant 
abnormalities in pulmonary function have been reported in current firefighters 
among smokers only (28), and even that seems to represent minimal small 
airways disease in asymptomatic firefighters employed for at least 25 years 
(63). Thus, the weight of evidence suggests that firefighters are not at greatly 
increased risk of chronic respiratory disease unless they experience an unusual 
exposure (82, 126). 
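
For readers less familiar with the measure, the standardized mortality ratio (SMR) is the number of deaths observed in a cohort divided by the number expected from reference-population rates, conventionally multiplied by 100. The sketch below computes an SMR with an approximate 95% confidence interval using Byar's approximation to the Poisson limits. The counts are hypothetical, chosen only to reproduce an SMR of 93; they are not the counts reported by Musk et al (79), and the function names are illustrative.

```python
import math


def smr(observed: int, expected: float) -> float:
    """Standardized mortality ratio on the conventional x100 scale."""
    return 100.0 * observed / expected


def smr_confidence_interval(observed: int, expected: float, z: float = 1.96):
    """Approximate CI for the SMR via Byar's approximation (observed > 0)."""
    o = observed
    lower = o * (1 - 1 / (9 * o) - z / (3 * math.sqrt(o))) ** 3
    upper = (o + 1) * (1 - 1 / (9 * (o + 1)) + z / (3 * math.sqrt(o + 1))) ** 3
    return 100.0 * lower / expected, 100.0 * upper / expected


if __name__ == "__main__":
    # Hypothetical counts: 93 deaths observed where 100 were expected.
    obs, exp = 93, 100.0
    low, high = smr_confidence_interval(obs, exp)
    print(f"SMR = {smr(obs, exp):.0f} (95% CI {low:.0f} to {high:.0f})")
```

With these hypothetical counts the interval spans 100, which is what "within the expected range" means in practice: the data are compatible with neither an excess nor a deficit.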

In the past, there had been some concern that firefighters with early lung 
disease leave the occupation and that the remaining firefighters are, therefore, 
selected for respiratory health (89). This effect may be less pronounced than 
initially assumed (104, 106). Fire departments, in effect, protect their own 
most vulnerable members by transferring them into positions with less oppor- 
tunity for exposure, so that career firefighters with mild respiratory impair- 
ment may easily remain employed (102). Transfer patterns within the fire 
department result in a steady exit from active firefighting positions of those 
individuals most at risk for decline in airflow velocity and their movement into 
positions in which their duties involve fighting few or no fires. A powerful 
selection bias at work apparently protects firefighters with abnormalities of 
pulmonary function from further exposure (80, 81). 

The contribution of cigarette smoking to the overall picture remains difficult 
to sort out. Horsfield et al (50) studied 96 British firefighters and 69 
nonsmoking, nonfirefighter control subjects over four years to evaluate the 
progression of their pulmonary function and any respiratory symptoms. The 
firefighters were regularly interviewed with respect to their smoking habits 
and the degree to which they felt affected by exposure to smoke on the job. 
This index was admittedly subjective, but took into account the situations in 
which personal protection may have failed. The authors found no evidence for 
functional abnormality on spirometry; indeed, pulmonary function in these 
firefighters deteriorated at a rate slower than in the controls. They did observe 
a consistent and suggestive pattern of reported symptoms: Symptoms, pre- 
dominantly productive cough, were reported least often among the controls; 
more often among the nonsmoking, smoke-unaffected firefighters; at an 
intermediate frequency among smokers who were smoke-unaffected, as well 
as nonsmoking smoke-affected firefighters; and most often among smoking 
smoke-affected firefighters. Indeed, despite the crudely subjective index of 
occupational smoke exposure, the pattern strongly suggested a multiplicative 
interaction. The authors concluded that occupational exposure to smoke in 
firefighting is a determinant of respiratory symptoms almost as strong as 
cigarette smoking, with which it interacts, but that it does not appear to affect 
pulmonary function given current use of personal protection. 

These investigations were extraordinary in detail, perspicacity, and tenac- 
ity. The picture now seems to be fairly clear: Within the firefighting profes- 
sion, there is an effective, but largely tacit, mechanism that works by ad- 
ministrative means to protect the most vulnerable members. It now seems safe 
to conclude that occupational exposure can indeed cause respiratory disorders,
either alone in extreme situations or in combination with cigarette smoking. That
this was not reflected in greater mortality from respiratory diseases in past 
years may reflect the effectiveness of the administrative measures described 
above. The risk of death from respiratory causes in future will be further 
reduced by increasing compliance with and technical effectiveness of the use 
of personal protection devices. 


CARDIOVASCULAR DISEASE Despite a presumption of occupational 
association in many jurisdictions when a firefighter dies of a myocardial 
infarction, firefighters have not been consistently shown to be at elevated risk 
for death from heart disease. Recent studies suggest that mortality is about 
that expected (13, 27, 44, 46, 96, 122), although some studies have suggested 
elevations of 50% (99). There is ergonomic evidence that some firefighters 
may be stressed to the limit during the exertions of their work. That this stress 
does not result in increased mortality probably reflects a strong healthy 
worker effect and the decreasing levels of exertion required with seniority and 
advancement beyond captain. 







Two lines of reasoning suggest that cardiovascular disorders may be a 
problem among firefighters. The first is the documented presence of high 
degrees of cardiovascular stress during the response to alarms and the process 
of fighting the fire (10). The second is the known presence of carbon 
monoxide at high concentrations in smoke inhaled by firefighters (4, 18, 63). 
Several experiments have indicated that carbon monoxide exposure reduces 
the threshold for angina. 

The cardiovascular response to an alarm is pronounced. Firefighters show a 
marked increase in heart rate during the response to a fire alarm. This increase 
averages about 50 beats/min within 30 seconds of the alarm sounding and
persists until arrival at the fire. The elevation in heart rate is much greater than
that which would be expected in response to the exertion alone. During
firefighting, heart rates of 150-160 beats/min were the norm, but occasional
peaks of 175-195 occurred, especially during the first 3-5 minutes of a fire and
during stressful and dangerous crises. These are very high levels, associated 
with maximal exertion or anxiety. The response in heart rate does not show 
any consistent association with age or fitness of the firefighter, and there is 
great variation from person to person and in the same person from time to time 
(10, 55). 

Electrocardiogram (EKG) changes suggesting coronary artery disease were 
found by Barnard et al (11) in nine of 90 randomly selected firefighters aged 
40-59 in the city of Los Angeles. This level of prevalence of EKG- 
demonstrable coronary artery disease was comparable to that expected for a 
large group of middle-aged men but this in itself is surprising because 
firefighters, who are selected by stringent criteria for fitness, demonstrated a 
reduced prevalence of cardiovascular risk factors (12). Four of six firefighters 
with EKG abnormalities suggestive of ischemic changes had no evidence of 
advanced coronary artery disease, but three had abnormal left ventricular wall 
function (11). This raised the possibility that firefighters may be at risk for 
nonischemic myocardial injury on the basis of exposure to carbon monoxide 
or elevated circulating catecholamines (8-11). Indeed, of the original group 
tested, two had myocardial infarcts within two days of testing, an alarming 
experience (12). 

Barnard and coworkers (9, 12) also described an ischemic response in 
healthy young men, including firefighters, who engaged in sudden, vigorous 
exercise without warm-up. They suggested a transient mismatch in oxygen 
supply and demand at the subendocardial level caused by temporarily in- 
adequate perfusion for the suddenly increased demand for myocardial ox- 
ygen. Arterial pressure measurement confirmed that the relaxation time avail- 
able for restoring coronary blood supply during diastole was markedly re- 
duced during cold start-up exercise. Although this mechanism is probably not 
a cause of persistent or cumulative myocardial injury, it may play a role in 
unusual and emergent situations that require sudden maximal exertion. 

The experience of firefighters who were studied in large groups has been
quite different. Dibbs et al (27) examined a similarly “healthy” group of 171 
firefighters in Boston enrolled in a cohort study on aging effects. They found 
a distribution of risk factors similar to that seen by Barnard and coworkers, 
but the incidence of detectable coronary heart disease and its complications 
over ten years of observation was no different than that for nonfirefighters of 
the same age in the study. The discrepancy suggests that Barnard’s group of 
subjects, and the group studied before him by Felton (35) from the county of 
Los Angeles (distinct from the city, but recruited from the same population) 
differed in some important ways from the Boston firefighters. 


ERGONOMIC ISSUES 


Firefighting is a very strenuous occupation, which is often performed under 
extreme environmental conditions (67). The demands of firefighting are 
sporadic and unpredictable, characterized by long periods of waiting between 
bouts of intense activity. This irregular pattern of activity is an important 
feature of firefighting, as it adds to the component of stress that is probably 
caused by anxiety and responses to psychogenic stress. 

There are several components to the physiological demands of firefighting, 
including energy cost of performing firefighting activities, heat stress associated
with heat from the fire, and encumbrance by personal protection equipment.
A detailed understanding of the physiological demands of firefighting
must consider the contribution of each component and changes in each over 
time. For example, the use of personal protection equipment has imposed new 
physiological demands on firefighters, but has removed other demands by 
reducing exposure levels; personal protection equipment is also improving 
over time with advances in technology (66). 


Energy Costs and Performance 


Among common firefighting activities, climbing the aerial ladder is one of the 
most strenuous. Other strenuous activities include climbing stairs, dragging 
hose, rescuing a victim, and raising the ladder (58, 84, 95). 

Firefighters adjust their levels of exertion in a characteristic pattern during 
simulated fire conditions, as reflected by heart rate. Initially, their heart rate 
increases rapidly to 70-80% of maximal within the first minute (68). As firefight- 
ing progresses, they maintain their heart rates at 85-100% maximal until the fire 
is out. With the addition of equipment and SCBA apparatus, they adjust their 
levels of exertion to remain at this intense level of activity. In other words, 
firefighters maintain their level of exertion at a relatively constant, intense level 
once active firefighting begins. Any additional burden, such as encumbrance by the
necessary protective equipment or victim rescue, reduces performance because 
firefighters are already exerting themselves to the maximum. 

The energy requirements for firefighting are complicated by the adverse
conditions in many inside fires. The metabolic demands of coping with heat 
transfer and fluid balance add to the existing demands of physical exertion. A 
major issue is the combined effect of the accumulation of internally generated 
heat during strenuous exercise and the external heat during fire conditions 
(38, 109). 


Fitness and Performance Capacity 


Numerous studies have evaluated the physiological characteristics of 
firefighters, usually in the context of other studies to determine the response 
to firefighting-related demands. Studies of the fitness of firefighters have 
shown fairly consistently that most firefighters are as fit as, or somewhat more
fit than, the general adult male population. However, they are not fit to an
athletically trained level (17, 25, 57, 86). Fitness and health maintenance 
programs have been developed for firefighters, but have not been con- 
vincingly evaluated for their effectiveness. 

The entrance of women applicants into firefighting caused a reevaluation of 
performance tests and studies comparing the sexes. Misner et al examined the 
performance of 37 men and 25 women on nine job-related tasks used as a 
screening battery in Chicago. The subjects were recruited from among 
athletes in training and individuals known to be highly physically fit. The 
intent was to compare the performance of suitably trained individuals who 
could achieve their potential maximum performance, rather than the assess- 
ment of typical applicants. They found that women demonstrated lower scores 
on average than men in all performance items, but that a subgroup of women 
performed nearly as well in some tasks. The overall difference in performance 
was primarily attributed to lower absolute lean body weight, which correlated 
most strongly and consistently with performance difference (76). The most 
difficult items for women were the stair-climbing exercises. Leg strength 
appears to be predicted by lean body weight, but not by other anthropometric 
measurements (75). 

Given the potential for heat stress, toxic exposure, and hypoxia, Evanoff & 
Rosenstock (32) have suggested that women firefighters who are pregnant 
should cease firefighting activity sometime during the second trimester, and 
that contract policies facilitate pregnancy leave and temporary reassignment. 


CONCLUDING REMARKS 


The demands and hazards of firefighting have changed over the past decades
(1, 4, 26, 34, 49, 90, 124), but the high quality and standard of service have 
remained the same (105). The use of highly sophisticated firefighting equip- 
ment and the introduction of innovative firefighting techniques, safer personal 
protective equipment (60), and better communications and information sys-
tems, as well as healthier life-styles (25, 40, 67), have helped meet public 
demands for service and, at the same time, have provided a safer and healthier 
working environment for the firefighter. In spite of these advances, firefight- 
ing continues to be a very hazardous occupation (47, 83, 90, 119). 


INTRODUCTION


The possible relationship between human exposure to time-varying magnetic 
fields in the extremely-low-frequency (ELF) range and adverse health effects 
has become a subject of considerable public interest and concern. By defini- 
tion, the ELF band is composed of electromagnetic fields with frequencies 
below 300 Hz and, therefore, encompasses the 50-Hz and 60-Hz frequencies 
used throughout the world for electric power transmission and distribution. 
Numerous reports have appeared in the literature during the past decade that 
claim to link exposure to ELF fields in the home and workplace to an apparent 
elevation in cancer risk. This chapter explores the biological interactions and 
potential human health effects of ELF magnetic fields and summarizes and 
critically evaluates the literature that has given rise to the public and scientific 
debate on this subject. 


PHYSICAL QUANTITIES, UNITS, AND MEASUREMENT
TECHNIQUES FOR ELF MAGNETIC FIELDS 


Quantities and Units 


The flux density, B, of a magnetic field is defined in terms of the force F 
exerted on a charge moving with velocity v (the Lorentz force law), F = Q(v
× B). The term in parentheses is a vector cross-product equal in magnitude to
|v| |B| sin θ, where θ is the angle between v and B. With F in newtons, Q in
coulombs, and v in m/s, the metric unit for the magnetic flux density B is the 
tesla (T). The units of electric and magnetic field quantities are summarized in 
Table 1. It is important to note from the Lorentz force law that a maximum 
force is exerted on the moving charge Q when v and B are orthogonal, and no 
force is exerted when they are parallel. In addition, a magnetic field exerts no 
force (and, hence, does no work) on a charge that is not moving. 
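
The three limiting cases just described (orthogonal, parallel, and stationary charge) can be checked numerically. The following is a minimal sketch, not part of the original text; the function name and the sample values (one elementary charge, 1 m/s, 1 T) are illustrative assumptions.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs (exact SI value)


def lorentz_force_magnitude(charge_c, speed_m_s, flux_density_t, angle_rad):
    """Magnitude of F = Q(v x B), i.e. |Q| |v| |B| sin(theta), in newtons."""
    return abs(charge_c) * speed_m_s * flux_density_t * math.sin(angle_rad)


# Assumed illustrative values: one elementary charge, 1 m/s, 1 T.
f_orthogonal = lorentz_force_magnitude(ELEMENTARY_CHARGE, 1.0, 1.0, math.pi / 2)
f_parallel = lorentz_force_magnitude(ELEMENTARY_CHARGE, 1.0, 1.0, 0.0)
f_at_rest = lorentz_force_magnitude(ELEMENTARY_CHARGE, 0.0, 1.0, math.pi / 2)

print(f_orthogonal, f_parallel, f_at_rest)
```

The force is largest when v and B are orthogonal and vanishes when they are parallel or when the charge is at rest.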


ELF Magnetic Field Measurements 


A time-varying magnetic field induces a voltage in any electrically conductive 
circuit exposed to it. This fact is commonly used in magnetic field meters, 
which measure the voltage induced by an alternating magnetic field in a
“search” coil (23, 54). Meters of this type that are sensitive to low-intensity
fields are commercially available. Miniature magnetic field monitors can be 
worn by individuals for personal exposure measurements. One example is the 
EMDEX personal monitor, which contains an on-board microprocessor for 
logging a time history of field exposures (90). The Institute of Electrical and 
Electronics Engineers, Inc. has established standard methods for the measure- 
ment of time-varying magnetic fields from power line sources (39). 


Table 1 Quantities and units of electric and magnetic fields 








Quantity                       MKS/SI unit (a)     CGS unit (b)                      Equivalence (c)

Electric field intensity       volt/meter (V/m)    statvolt/centimeter (statV/cm)    1 V/m = 3.33 × 10⁻⁵ statV/cm
Magnetic field flux density    tesla (T)           gauss (G)                         1 T = 10⁴ G

(a) MKS/SI is the International System of metric units (m/kg/s).
(b) CGS is the Gaussian system of units (cm/g/s).
(c) Units most commonly used to describe magnetic fields in laboratory research and in typical
human exposure conditions are mT and µT, which are equal to 10 G and 10 mG, respectively.
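
As a quick check on the magnetic field units in Table 1, the conversion 1 T = 10⁴ G can be written as a short sketch. The function name is an illustrative assumption, not something defined in the original text.

```python
import math

GAUSS_PER_TESLA = 1.0e4  # 1 T = 10^4 G


def tesla_to_gauss(flux_density_t):
    """Convert a magnetic flux density from tesla (SI) to gauss (CGS)."""
    return flux_density_t * GAUSS_PER_TESLA


one_millitesla_in_gauss = tesla_to_gauss(1e-3)   # about 10 G
one_microtesla_in_gauss = tesla_to_gauss(1e-6)   # about 0.01 G, i.e. 10 mG

print(one_millitesla_in_gauss, one_microtesla_in_gauss)
```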


SOURCES OF ELF MAGNETIC FIELDS


Magnetic fields in the ELF range are present throughout the environment and 
originate from both natural and man-made sources (36, 72). The naturally 
occurring, time-varying fields in the atmosphere have several origins, including
diurnally varying fields on the order of 0.03 µT associated with solar and
lunar influences on ion currents in the upper atmosphere. The largest time-varying,
atmospheric magnetic fields arise intermittently from intense solar
activity and thunderstorms and reach intensities on the order of 0.5 µT during
a large magnetic storm. Superimposed on the magnetic fields associated with 
irregular atmospheric events is a weak ELF field, which results from the 
Schumann resonance phenomenon (36). These fields are generated by light- 
ning discharges and propagate in the resonant atmospheric cavity formed by 
the surface of the earth and the lower boundary of the ionosphere. 

Extremely-low-frequency magnetic fields originating from man-made
sources generally have much higher intensities than the naturally occurring 
atmospheric fields and, in some occupational settings, reach levels that 
approach 0.1 T. Two sources of ELF fields that have been the topics of 
considerable public interest are high-voltage transmission lines and land- 
based naval communication systems. The field at ground level beneath a 
765-kV, 60-Hz power line carrying 1 kA per phase is 15 µT (87). The
maximum field at ground level associated with the ELF antennae used in
submarine communications is 14 µT (4). Household appliances operated from
a 60-Hz line voltage source produce local magnetic fields in their immediate 
vicinity with flux densities as high as 2.5 mT (36). However, the magnetic 
field strength decreases rapidly as a function of distance from the surfaces of 
household devices (28), and the ambient field levels at most locations within a 
household environment are generally less than 0.3 µT (17, 57, 90). The video
display terminals present in most offices generate local ELF magnetic fields
with flux densities up to 5 µT, although the typical exposure level at the
operator’s location is less than 1 µT (93, 107). Several industrial heating
processes produce ELF magnetic fields of high intensity within the occupational
environment. For example, based on a survey of electrosteel and
welding industries in Sweden, Lövsund et al (55) reported that the local fields
near 50-Hz ladle furnaces reached a level of 8 mT. The authors also measured 
flux densities as high as 0.07 T near induction heating devices that operate in 
the 50-Hz to 10-kHz range. 

Time-varying magnetic fields in the ELF frequency range are also em- 
ployed in medical treatments, including the stimulation of bone fracture 
reunion (8, 113) and the measurement of blood flow rates (62). Magnetic 
resonance imaging also produces time-varying magnetic fields up to several 
T/s as a result of the switching of magnetic field gradients used for the 
localization of nuclei with magnetic moments, such as protons (15, 94). 


INTERACTIONS OF ELF FIELDS WITH TISSUE


Extremely-low-frequency magnetic fields induce electrical currents in tissue 
that circulate in loops within planes that are orthogonal to the direction of 
incidence of the field. This relationship between a time-varying magnetic 
field and the circulating electric field that it induces is expressed formally by 
Faraday’s law: 


$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E} \tag{1}$$


where ∇ × E is the curl of the electric field vector. A magnetically induced
electric field gives rise to currents that are predicted from Ohm’s law, J =
σE, where J is the induced current density, expressed in the MKS/SI system
in A/m², and σ is the tissue conductivity in S/m [siemens (S) = ohm⁻¹].

The magnitude of the induced current density can be calculated easily from 
Equation 1 and Ohm’s law for simple geometries. Consider, for example, a
model of the human body as a uniformly conductive ellipsoid of revolution
with the major axis, z, parallel to the long axis of the body. If a sinusoidal
magnetic field with an amplitude, B₀, and frequency, f, is incident along the
z axis, then the peak amplitude of the induced current density in a plane
defined by the orthogonal x and y coordinate axes is given by:


$$J = \frac{2\pi f B_0 \sigma}{a^2 + b^2}\left(b^4 x^2 + a^4 y^2\right)^{1/2} \tag{2}$$


where a and b are the semi-axes of the ellipsoid. The induced currents
circulate in closed loops within a plane defined by the x and y coordinates
(orthogonal to the z axis).

Magnetically induced current densities in erect humans can be modeled to a 
good approximation by using Equation 2. A typical adult man has a height of 
1.7 m, a mass of 70 kg, and a ratio of body width to thickness of about 2. An 
ellipsoid with semi-major axes of 0.85 m, 0.20 m, and 0.10 m has the same 
height, the same width-to-thickness ratio, and a body volume of 7.1 × 10⁻²
m³. The maximum electric field and current density induced in this ellipsoidal
model occur when the magnetic field lines are horizontal and perpendicular to
the front of the body. By using Equation 2 with a = 0.20 m and b = 0.85 m,
the calculated value of J is 1.2 fB₀σ. As an example, for a human with an
average tissue conductivity of 0.2 S/m in a 60-Hz, 50-µT field (the highest
value under a high-voltage transmission line), J = 0.7 mA/m².
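
The worked example above can be reproduced with a short sketch. It assumes Equation 2 has the form J = 2πfB₀σ (b⁴x² + a⁴y²)^(1/2) / (a² + b²) and evaluates it at the end of the shorter semi-axis, (x, y) = (a, 0), where the expression is largest on the boundary when b > a. The function and variable names are illustrative assumptions.

```python
import math


def peak_current_density(x, y, a, b, freq_hz, b0_tesla, sigma_s_per_m):
    """Peak induced current density (A/m^2) at point (x, y), per the assumed form of Equation 2."""
    prefactor = 2.0 * math.pi * freq_hz * b0_tesla * sigma_s_per_m / (a**2 + b**2)
    return prefactor * math.sqrt(b**4 * x**2 + a**4 * y**2)


a, b = 0.20, 0.85  # semi-axes in metres, as given in the text

# Geometric coefficient multiplying f * B0 * sigma (set each of them to 1).
geometric_coefficient = peak_current_density(a, 0.0, a, b, 1.0, 1.0, 1.0)

# 60-Hz, 50-microtesla field with tissue conductivity 0.2 S/m.
j_max = peak_current_density(a, 0.0, a, b, 60.0, 50e-6, 0.2)

print(f"coefficient = {geometric_coefficient:.2f}")
print(f"J = {j_max * 1e3:.2f} mA/m^2")
```

Under these assumptions the coefficient rounds to 1.2 and the current density to about 0.7 mA/m², in agreement with the values quoted in the text.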

Although the initial physical interaction of time-varying magnetic fields 
with living systems is the induction of electric currents in tissue, several
secondary events may occur that involve biochemical and structural alterations
at the cellular and subcellular levels. At present, there is convincing
tions at the cellular and subcellular levels. At present, there is convincing 
evidence that ELF magnetic fields do not produce DNA strand breaks or 
influence the repair of DNA damage caused by other agents (26, 75). There is 
also evidence that ELF fields do not produce cytogenetic alterations and are 
not directly mutagenic (18, 53). However, a growing body of evidence 
indicates that the pericellular currents established by ELF fields can alter ion 
binding to membrane macromolecules and influence ligand-receptor in- 
teractions at the cell surface (e.g. the binding of hormones or other mitogens) 
(2, 97, 102). These changes in membrane properties are envisioned as setting 
up transmembrane signaling events, possibly mediated by Ca²⁺ or cyclic
nucleotides, that trigger abnormal biochemical and cell growth states. Recent
experiments have demonstrated that a magnetically induced electric field of
0.1 V/m can significantly increase Ca²⁺ uptake in mitogen-activated thymocytes
(112). In other experiments, investigators have observed that exposure
to pulsed and sinusoidal ELF magnetic fields leads to altered RNA transcrip- 
tion patterns in dipteran salivary gland cells and in cultured human cells (30, 
31, 33, 34, 115). This effect is accompanied by a significant change in the 
spectrum of cellular proteins synthesized by the exposed cells relative to 
control cells (33). The changes observed in RNA transcripts exhibit a strong 
dependence on the amplitude and frequency of the applied ELF magnetic field 
(34, 115). A possible explanation of these observations is that the field alters 
the rate constant for one (or more) of the intermediate sequential reactions that 
are involved in RNA synthesis and degradation (52). The threshold field level 
for producing such effects appears to be on the order of 1 mV/m or less (32,
34). 

The findings of several cellular and membrane responses to relatively weak 
ELF fields have raised the question of how these signals compare with the 
thermally generated electrical noise (Nyquist noise) present in cell membranes 
(114). For fields in the ELF range, the minimum signal strength required to 
exceed the Nyquist noise in a cell membrane was estimated to be approximately
0.1 V/m. This calculation was based on the effective bandwidth, Δf,
of a cell membrane represented as a parallel combination of a resistance, R,
and a capacitance, C: Δf = (4RC)⁻¹. However, a considerably lower field
threshold of approximately 0.1 mV/m was predicted if the membrane response
to an applied field occurs only in a narrow band of frequencies (e.g. a
sponse to an applied field occurs only in a narrow band of frequencies (e.g. a 
10-Hz bandwidth), and if the effects of signal-averaging are considered. 
These simple physical concepts add to the plausibility that relatively weak 
ELF fields can produce measurable responses in cellular functions mediated 
by the plasma membrane. However, the above analysis is confined to individual
cells and does not address the confounding effects of endogenous
ELF background fields that are always present in the environment of cells and
tissues in vivo (10).

In a recent theoretical paper, Adair (1) has questioned whether biological 
effects could occur in response to the weak ambient fields to which humans 
are routinely exposed in the home or workplace. He argues that the fields 
induced in tissue at the level of individual cells would be too weak to 
overcome the effects of Boltzmann thermal noise or electrical noise in cell 
membranes. This theoretical treatment, however, neglects the considerable 
signal amplification that can occur in large arrays of electrically coupled cells 
in tissue. It also fails to consider nonequilibrium phenomena, such as coop- 
erative transitions (2, 95, 103), through which extremely weak signals could 
exert significant effects on cell membrane properties. 


EFFECTS OF COMBINED STATIC AND ELF MAGNETIC 
FIELDS 


Several experimental studies have provided evidence that the combination of 
a weak static magnetic field, comparable in strength to the geomagnetic field, 
and a time-varying magnetic field in the ELF frequency range can produce 
resonance interactions that influence ion movements through membrane chan- 
nels and other biological phenomena. Five types of experiments have in- 
dicated that certain combinations of static magnetic field flux density and 
time-varying magnetic field frequency can produce alterations in the rate of 
calcium ion release from the surfaces of cells in brain tissue (12); the operant 
behavior of rats in a timing discrimination task (50, 105); calcium-dependent 
diatom mobility (74, 91); calcium ion uptake by human lymphocytes (49); 
and changes in fibroblast proliferation (77). Investigators have suggested that 
the physical mechanism underlying these effects is ion cyclotron resonance 
(22, 46, 47, 60). In this process, a resonant transfer of energy from a 
time-varying magnetic field occurs when its frequency matches the cyclotron 
resonance frequency of an ion moving within a static magnetic field. The 
resonance condition is formally expressed by the equation, 


$$f_c = \frac{QB}{2\pi m} \tag{3}$$


where f_c is the ion cyclotron resonance frequency, Q is the ion charge, and m
is the ion mass. For the typical range of the geomagnetic field over the surface
of the earth (30-70 µT), the resonant frequencies of many biologically
important ions, such as Na⁺, K⁺, and Ca²⁺, fall within the ELF range.
Although several experimental results suggest a resonance mechanism 
through which weak static and ELF fields could produce measurable biological
effects, the interpretation of this work presents theoretical difficulties.
There are five major problems with the ion cyclotron resonance theory: the 
collision frequency of ions undergoing cyclotron resonance motion in mem- 
brane channels is required to be orders of magnitude less than the typical 
collision frequency in an aqueous solution at physiological temperatures; the 
interaction energy of the weak static magnetic field with biological ions is 
several orders of magnitude less than the Boltzmann thermal energy, kT (= 
4.28 × 10⁻²¹ J at 310 K); the thermally generated electrical noise (Nyquist
noise) present in ion transport channels that traverse biological membranes 
(114) is several orders of magnitude greater than the electric field induced in 
these channels by the resonant time-varying magnetic field; the radius calcu- 
lated for a stable ion cyclotron orbit under the conditions used in the ex- 
periments cited above is approximately 50 m, which is more than 10 orders of 
magnitude greater than the typical dimensions of an ion channel in a cell 
membrane (79); and for ion motion that is constrained to lie along a prescribed 
path, such as the helical path envisioned by Liboff (46) for ion transport 
through membrane channels, it follows directly from the equation of motion 
for the particle that a static magnetic field cannot influence the ion movement 
and establish a resonance condition (38). The ion cyclotron resonance interac- 
tion is thus limited to unconstrained ion movements through membrane 
channels. All these factors would interfere with the establishment of ion 
cyclotron resonance conditions in combined static and time-varying magnetic 
fields. Obviously, there is a need to refine the theoretical description of this 
phenomenon before it can form a plausible basis for weak field interactions 
with biological membranes. Also, two recent studies have failed to observe an 
effect of combined fields on Na⁺ and Ca²⁺ transport under ion cyclotron
resonance conditions (48, 69). 
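
For orientation, the resonance condition stated earlier, f_c = QB/(2πm), can be evaluated across the quoted geomagnetic range of 30-70 µT. This is a minimal sketch under stated assumptions: bare (unhydrated) ions, approximate standard atomic masses, and hypothetical function and variable names. It illustrates only where the resonant frequencies fall and does not bear on the theoretical objections listed above.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kilograms

# (charge number, approximate atomic mass in u); bare ions are an assumption.
IONS = {"Na+": (1, 22.990), "K+": (1, 39.098), "Ca2+": (2, 40.078)}


def cyclotron_frequency_hz(charge_number, mass_u, flux_density_t):
    """Ion cyclotron resonance frequency f_c = Q B / (2 pi m), in hertz."""
    charge = charge_number * ELEMENTARY_CHARGE
    mass = mass_u * ATOMIC_MASS_UNIT
    return charge * flux_density_t / (2.0 * math.pi * mass)


# Frequency range for each ion over the 30-70 microtesla geomagnetic range.
ranges_hz = {
    name: (cyclotron_frequency_hz(z, m, 30e-6), cyclotron_frequency_hz(z, m, 70e-6))
    for name, (z, m) in IONS.items()
}

for name, (low, high) in ranges_hz.items():
    print(f"{name}: {low:.1f} to {high:.1f} Hz")
```

Under these assumptions all three ions give frequencies of a few tens of hertz, well below the 300-Hz upper bound of the ELF band.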

A recent theoretical model has been proposed in which the resonant effects 
of combined static and ELF fields are visualized as resulting from an effect on 
the vibrational energy levels of an ion, with a resulting effect on its interaction 
with ligand binding sites (45). This theory is physically more plausible than 
the ion cyclotron resonance model, but remains to be tested experimentally. 


BIOLOGICAL EFFECTS OF ELF MAGNETIC FIELDS 


An effect of ELF magnetic fields on humans, which was first described by 
d’Arsonval (20), is the induction of a flickering illumination within the visual
field known as magnetophosphenes. This phenomenon occurs as an im- 
mediate response to stimulation by either pulsed or sinusoidal magnetic fields 
with frequencies less than 100 Hz, and the effect is completely reversible with 
no apparent influence on visual acuity. The maximum visual sensitivity to 
sinusoidal magnetic fields has been found at a frequency of 20 Hz in human 
subjects with normal vision. At this frequency, the threshold magnetic field
flux density found by Lövsund et al (56) to elicit phosphenes is approximately
10 mT. The threshold field level increases rapidly as a function of frequency 
below and above 20 Hz. The corresponding time rate of change of the 
sinusoidal field is 1.26 T/s. In other studies, Silny (89) has observed 
thresholds for magnetophosphene perception in human volunteers as low as 5 
mT with 18-Hz sinusoidal fields. In studies with pulsed fields that had a rise 
time of 2 ms and a repetition rate of 15 Hz, the threshold values of dB/dt for 
eliciting phosphenes ranged from 1.3 to 1.9 T/s in five adult subjects (14). A 
trend in the data suggested that the threshold was lower among younger 
subjects. In related studies, investigators also observed that the stimulus 
duration is an important parameter, because pulses of 0.9-ms duration with 
dB/dt = 12 T/s did not evoke phosphenes. 
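
The link between a sinusoidal threshold amplitude and its time rate of change follows from B(t) = B₀ sin(2πft), whose peak derivative is 2πfB₀. The sketch below, with an assumed function name, reproduces the 1.26 T/s figure quoted for a 10-mT field at 20 Hz and applies the same relation to the 5-mT, 18-Hz threshold.

```python
import math


def peak_db_dt(freq_hz, amplitude_t):
    """Peak dB/dt (T/s) of a sinusoidal field B(t) = B0 sin(2 pi f t)."""
    return 2.0 * math.pi * freq_hz * amplitude_t


rate_20hz_10mt = peak_db_dt(20.0, 10e-3)  # threshold reported at 20 Hz
rate_18hz_5mt = peak_db_dt(18.0, 5e-3)    # lower threshold reported at 18 Hz

print(f"{rate_20hz_10mt:.2f} T/s, {rate_18hz_5mt:.2f} T/s")
```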

Several types of experimental evidence indicate that the magnetic field 
interaction that leads to magnetophosphenes occurs in the retina: magne- 
tophosphenes are produced by time-varying magnetic fields applied in the 
region of the eye, and not by fields directed toward the visual cortex in the 
occipital region of the brain (5); pressure on the eyeball abolishes sensitivity 
to magnetophosphenes (5); the threshold magnetic field flux density required 
to elicit magnetophosphenes in human subjects with defects in color vision 
has a different dependence on the field frequency than that observed for 
subjects with normal color vision (56); and in a patient who had both eyes 
removed as the result of severe glaucoma, phosphenes could not be induced 
by time-varying magnetic fields, thereby precluding the possibility that mag- 
netophosphenes can be initiated directly in the visual pathways of the brain 
(56). 

Silny (89) also studied other phenomena related to the sensitivity of the 
visuosensory system to time-varying magnetic fields. In experiments with 
human subjects, distinct flickering could be elicited in the visual field by 
sinusoidal magnetic fields in the frequency range of 5-60 Hz. The threshold 
field intensity varied with the field frequency and background light level, but 
was as low as 5 mT under optimal conditions. Alterations in visually evoked 
potentials (VEP) were also reported to occur in sinusoidal magnetic fields at 
intensity levels that are five to ten times greater than those that produce 
magnetophosphenes (89). The change in VEP was characterized by a reversal 
of polarity and a decreased amplitude of the three major evoked potentials. 
These effects were observed within three minutes after onset of the magnetic 
field exposure, and the VEP returned to normal only after a recovery period of 
approximately 30-70 minutes following termination of the exposure. The 
relationship of these changes in the VEP to the mechanism of magnetophos- 
phene induction is not clear from currently available evidence. 

Time-varying magnetic fields that induce current densities above 1 A/m² in 
tissue lead to neural excitation and can produce irreversible biological effects, 
such as cardiac fibrillation (76, 102). Several investigators have achieved 
direct neural stimulation by using pulsed or sinusoidal magnetic fields that 
induced tissue current densities in the range of 1-10 A/m². In one study 
involving electromyographic recordings from the human arm, Polson et al 
(73) found that a pulsed field with dB/dt greater than 10⁴ T/s was required to 
stimulate the median nerve trunk. The duration of the magnetic stimulus is 
also an important parameter in the excitation of nerve and nerve-muscle 
specimens. By using a 20-kHz sinusoidal field applied in bursts of 0.5- to 
50-ms duration, Oberg (68) found that a progressive increase in the magnetic 
flux density was required to stimulate the frog gastrocnemius neuromuscular 
preparation when the burst duration was reduced to less than 2-5 ms. A 
similar rise in threshold stimulus strength has been observed for frog neuro- 
muscular stimulation by using pulsed magnetic fields with pulse durations less 
than approximately 1 ms (109, 110). 
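
The relation between a field's rate of change and the current density it induces can be illustrated with the standard Faraday's-law estimate for a circular path of radius r in tissue of uniform conductivity σ, namely J = σ(r/2)(dB/dt). The sketch below is an order-of-magnitude illustration only. The conductivity (0.2 S/m), the loop radius, and the field values are assumed for the example and are not parameters reported in the studies cited above.

```python
import math

def induced_current_density(db_dt, radius_m, sigma_s_per_m=0.2):
    """Current density (A/m^2) induced around a circular loop in tissue.

    Faraday's law for a uniform field normal to the loop gives
    E = (r/2) * dB/dt, and Ohm's law gives J = sigma * E.
    sigma_s_per_m = 0.2 S/m is an assumed, illustrative conductivity.
    """
    return sigma_s_per_m * (radius_m / 2.0) * db_dt

def peak_db_dt_sinusoid(b_peak_tesla, freq_hz):
    """Peak dB/dt (T/s) of a sinusoidal field B(t) = B_peak * sin(2*pi*f*t)."""
    return 2.0 * math.pi * freq_hz * b_peak_tesla

# Hypothetical example: 60-Hz field, 1 mT peak, 0.1-m loop radius
j = induced_current_density(peak_db_dt_sinusoid(1e-3, 60.0), radius_m=0.1)
print(f"{j * 1e3:.2f} mA/m^2")
```

Because J scales linearly with both the loop radius and dB/dt, the same field induces larger current densities along larger conducting paths in the body.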

Time-varying magnetic fields that induce tissue current densities less than 
approximately 1-10 mA/m² produce few, if any, irreversible biological 
effects. This general observation is not surprising, as the endogenous current 
densities present in many organs and tissues lie in the range of 0.1 to 10 
mA/m², as discussed by Bernhardt (10). In contrast, time-varying magnetic 
fields that induce peak current densities greater than approximately 10 mA/m² 
reportedly produce various alterations in the biochemistry and physiology of 
cells and organized tissues. One example is the effect of the bidirectional 
pulsed fields used to facilitate bone fracture reunion in humans (8). Numerous 
laboratory investigations have also led to reports of a broad spectrum of 
alterations in cellular, tissue, and animal systems in which current densities 
exceeding 1-10 mA/m² were induced by ELF magnetic fields (96-100). 
These effects include altered cell growth rate; decreased rate of cellular 
respiration; altered metabolism of carbohydrates, proteins, and nucleic acids; 
effects on gene expression and genetic regulation of cell functions; ter- 
atological and developmental effects; morphological and other nonspecific 
tissue changes in animals, frequently reversible with time following exposure; 
endocrine alterations, including suppression of the nocturnal level of pineal 
melatonin; altered hormonal responses of cells and tissues, including effects 
on cell-surface receptors; and altered immune response to antigenic stimula- 
tion. 

In assessing these reported effects of time-varying magnetic fields, it is 
important to recognize that very few of the observations have been in- 
dependently replicated in a second laboratory. In many cases in which 
replication was attempted, the results were contradictory. One 
notable example of this variability is the attempt by several groups of in- 
vestigators to determine whether teratological effects result from the exposure 
of chicken embryos to pulsed magnetic fields of low intensity, as originally 
reported by Delgado et al (21). Widely divergent results, ranging from no 
effects to significant effects on embryo development, were obtained in these 
experiments (97). To resolve the question of whether Delgado et al’s (21) 
original experimental results are replicable under controlled laboratory con- 
ditions, the Office of Naval Research and the US Environmental Protection 
Agency (EPA) recently sponsored an international cooperative effort, which 
involved six independent laboratories (9). Laboratories located in Spain, 
Sweden, Canada, and the US were equipped with identical pulsed magnetic- 
field exposure systems that had been constructed and tested by the same 
engineering team. The pulse parameters chosen for this study were 100-Hz 
repetition frequency, 500-µs pulse duration, 2-µs pulse rise and fall times, 
and peak magnetic flux density of 1.0 µT. Each of the six laboratories 
conducted ten separate experiments with 20 chick eggs, ten of which were 
exposed to the pulsed field for the first 48 hours of incubation; the remaining 
ten eggs were sham-exposed for the same time interval. Two of the six 
laboratories observed a statistically significant increase in the proportion of 
abnormal embryos (p < 0.001 and p = 0.03), whereas the other four 
laboratories did not observe a significant difference between the exposed and 
sham-exposed embryos. The overall data from the six different laboratories, 
however, did show a statistically significant increase in the proportion of 
abnormal embryos in the exposed groups of eggs. The interlaboratory var- 
iations observed in this series of experiments are indicative of the difficulty 
encountered in replicating the results of biological studies on ELF field 
effects, even when exceptional efforts are made to control the relevant 
experimental variables. 


HUMAN HEALTH STUDIES 


Many studies on human responses to ELF magnetic fields have been reported, 
primarily epidemiological studies on adverse reproductive outcomes and 
elevated cancer risk in more highly exposed groups of individuals (104, 111). 
The conclusions of many of these studies have been difficult to interpret. This 
problem has largely arisen because of a lack of information on the biologically 
effective exposure parameters for ELF fields, and because of the failure of 
many investigators to account for confounding variables. Only some high- 
lights of the human studies are summarized here, as extensive reviews are 
available (64, 97, 104, 111). 


Laboratory Investigations 


Several studies have been made of the general health profiles of individuals 
who work in electrical occupations or who were exposed to ELF magnetic 
fields under controlled laboratory conditions. Medical examinations of 379 
workers in electrical substations in Italy revealed no adverse clinical sym- 
ptoms relative to a control group of 133 workers (7). Laboratory studies on 
humans exposed to ELF magnetic fields have also failed to reveal any adverse 
physiological or psychological symptoms in the exposed subjects. The strong- 
est field used in these experiments was a 5-mT, 50-Hz field to which subjects 
were exposed for four hours by Sander et al (78). No field-associated changes 
were observed in serum chemistry, blood cell counts, blood gases and lactate 
concentration, electrocardiogram, pulse rate, skin temperature, circulating 
hormones (cortisol, insulin, gastrin, thyroxin), and various neuronal measure- 
ments, including visually evoked potentials recorded in the electroencephalo- 
gram. Graham et al (35) observed small changes in heart rate and motor 
responses in extensive studies on human subjects exposed to 60-Hz electric 
and magnetic fields with intensities comparable to those of fields in the 
vicinity of high-voltage transmission lines. These effects were reversible 
following termination of the exposure. Several other physiological and 
biochemical indices that were examined in this controlled human study did 
not exhibit significant changes during exposure to 60-Hz fields with intensit- 
ies up to 12 kV/m and 30 µT. 


Electric Blankets and Electrically Heated Beds 


A study by Wertheimer & Leeper (116) provided evidence of seasonal changes in 
fetal growth and in abortion rate among women who used electrically heated 
beds during the winter months. The authors contended that these adverse 
effects on fetal development could result from exposure to the 60-Hz 
electromagnetic fields present at the surfaces of electrically heated beds. They 
pointed out, however, that the potentially harmful effect of excessive heat on 
fetal growth cannot be excluded on the basis of their data. In a more recent 
study on magnetic field exposures in the home and cancer incidence in 
children, Savitz et al (84) concluded that the use of electric blankets was 
weakly associated with childhood cancer. However, the elevated odds ratio of 
1.3 was not statistically significant. Another study found no elevation in the 
risk of acute myelogenous leukemia as a result of electric blanket use (71). A 
recent report indicated that the nighttime urinary excretion of a melatonin 
metabolite, 6-hydroxymelatonin sulfate, is altered in human subjects who use 
electric blankets (120). The possible influence of this endocrine effect on 
human health remains to be determined. Many US manufacturers have recent- 
ly introduced design changes in the wiring of electric blankets that lead to 
nearly complete cancellation of the surface 60-Hz magnetic fields. Blankets 
that are operated with a direct current supply (i.e. not varying with time) have 
also been produced. 


Video Display Terminals 


Goldhaber et al (29) have reported that the rate of miscarriages increased by 
approximately 80% among women who worked on video display terminals 
for more than 20 hours per week, as compared with women who did similar 
work without the use of these devices. No statistically significant risk of 
miscarriage was found among women who worked at video display terminals 
for less than 20 hours per week. Overall, the increase in rate of birth defects 
was about 40% for women who worked at video display terminals for more 
than five hours per week, but this increase was not statistically significant. 
Although the results of this study suggest that the fields from video display 
terminals may enhance the risk of miscarriage, the possible role of job stress 
or other unidentified factors cannot be excluded on the basis of the available 
information. In distinct contrast to the results of the Goldhaber et al (29) 
study, nine other epidemiological surveys have not obtained evidence for a 
significant elevation in spontaneous abortion rate or birth defects as the result 
of prolonged exposure during pregnancy to the electromagnetic fields from 
video display terminals (13, 16, 24, 25, 44, 58, 66, 67, 86). The latest of the 
studies with negative outcomes was a comprehensive analysis involving more 
than 2000 directory-assistance telephone operators who use video display 
terminals throughout the workday (86). 


Residential Fields and Cancer Risk 


One of the most controversial issues related to the interaction of 
electromagnetic fields with humans is the reported link between residential 
and occupational exposure to ELF fields and cancer risk. The first report on 
this subject was published by Wertheimer & Leeper (119), who found that 
cancer deaths (primarily leukemia and nervous system tumors) in children less 
than 19 years of age in the Denver, Colorado, area were correlated with the 
presence of high-current primary and secondary wiring configurations near 
their residences. This retrospective epidemiological study was based on 344 
fatal childhood cancer cases from 1950 to 1973 and an equal number of 
age-matched controls chosen from birth records. The electrical power lines 
near the birth and death residences of the cancer cases and the residences of 
the controls were inspected and classified as being either high-current con- 
figurations (HCC) or low-current configurations (LCC), which are assumed to 
reflect the local intensity of the 60-Hz magnetic fields within the homes of the 
subjects. The percentage of the cancer cases whose birth and death residences 
were near HCC was significantly greater than for the control subjects, from 
which the authors concluded that an association may exist between the 
strength of magnetic fields from the residential power-distribution lines and 
the frequency of childhood cancer. In a subsequent publication, these authors 
reported that a similar association exists for the incidence of adult cancer 
(117). This later study was based on 1179 cancer cases (78% fatal) in Denver, 
Boulder, and Longmont, Colorado, from 1967 to 1977. 

Based on direct measurements of 60-Hz electric and magnetic fields in 434 
homes in the Denver metropolitan area, Barnes et al (6) concluded that the 
magnetic field component was weakly correlated with the power line wiring 
code used by Wertheimer & Leeper (117, 119). Kaune et al (43) obtained a 
similar result in studies of homes in three counties in Washington state. The 
60-Hz electric field component of electromagnetic fields measured within the 
homes was not correlated with the power line wiring code. This finding was 
not unexpected, as the electric fields emanating from power line sources are 
attenuated by trees, the walls of homes, and other objects. In contrast, the 
magnetic fields emanating from power lines are not influenced by materials 
that lack a significant amount of iron or other magnetic materials (e.g. trees or 
the walls of homes). Emphasis has therefore been placed on determining 
whether an association exists between cancer risk and exposure to the magnet- 
ic field component of ELF fields. 

Following Wertheimer & Leeper’s initial report on childhood cancer, five 
other epidemiological studies have been conducted to determine whether a 
relationship exists between residential magnetic fields from power line 
sources and the incidence of cancer in children. In the first of these studies, 
Fulton et al (27) used methodology that was matched as closely as possible to 
that of Wertheimer & Leeper, including the designation of HCC and LCC 
power lines. This study involved 119 leukemia patients with ages of onset 
from 0 to 20 years, whose address histories were obtained from medical 
records at Rhode Island Hospital, and 240 control subjects chosen from 
Rhode Island birth certificates. Fulton et al (27) concluded that no statistically 
significant correlation existed between the incidence of leukemia and the 
residential power line configurations. Wertheimer & Leeper (118) were critic- 
al of this study because the case and control groups had not been matched for 
interstate migration, for years of occupancy at residences, or for the ages of 
the children at the time their residential addresses were determined from birth 
records and hospital medical records. Reevaluating the data, Wertheimer & 
Leeper (118) excluded cases and controls aged 8 and older, which allowed 
them to define a complete residential history for the remaining subjects (53 
cases and 71 controls). In this subset of the total population studied by Fulton 
et al (27), Wertheimer & Leeper found a weakly significant correlation (p = 
0.05) between the incidence of leukemia and residential HCC power lines. 

Another study of childhood cancer incidence was conducted in the county 
of Stockholm by Tomenius (108), who analyzed the residential 50-Hz 
magnetic fields for 716 cases that had a stable address from the time of birth 
to the time of cancer diagnosis, and for 716 controls who were matched for 
age, sex, and birth location. An evaluation was made of the electrical wiring 
configurations near the residences of the study population, and measurements 
were made of the magnetic-field flux density in the frequency range above 30 
Hz at the entrance door to each residence. Among the residences within 150 m 
of 200 kV power lines, a statistically significant elevation was found in the 
incidence of cancer. The most frequently observed types of cancer were 
nervous system tumors and leukemia. There was, however, an inconsistency 
in the results of this study insofar as the cancer risk was greater in homes with 
magnetic field levels at the entrance less than 0.3 µT relative to homes in 
which the field level was more than 0.3 µT. 

In contrast to the findings of Wertheimer & Leeper (119) and Tomenius 
(108), Myers et al (63) found no relationship between the risk of childhood 
cancer and residential proximity to overhead power lines. This study was 
conducted in the Yorkshire Health Region in England, and included 376 
cancer cases diagnosed in children less than 15 years of age from 1970 to 
1979. There were 590 age-matched controls in the study. Magnetic fields at 
the birth addresses were calculated on the basis of data from the electrical load 
records for the overhead lines. The results showed no significant elevation in 
the cancer risk ratio with increasing field strength, and no dependence of the 
risk ratio on distance from the overhead lines. 

A case-control epidemiological study by Savitz et al (85) attempted to 
verify the initial findings of Wertheimer & Leeper (119) on childhood cancer 
in the Denver area. This study involved 357 cancer cases diagnosed between 
1976 and 1983. The cancer incidence data were analyzed on the basis of both 
the Wertheimer/Leeper wiring code and spot measurements of 60-Hz magnet- 
ic fields in the homes. A correlation between cancer risk in children less than 
14 years of age and the proximity of their residences to high-current wiring 
configurations was found. However, the authors observed no statistically 
significant association between the measured household fields and childhood 
cancer incidence. The results of this study are, therefore, ambiguous and 
suggest that the wire code developed by Wertheimer & Leeper may be an 
indicator of some unidentified carcinogenic factor, or factors, in the urban 
environment. This possibility is suggested by a recently completed study in 
Los Angeles (S. J. London et al 1991, unpublished) that led to results similar 
to those obtained in the Savitz et al (85) study. 

Two other epidemiological surveys have failed to detect an association 
between residential exposure to power-frequency fields and cancer risk. In 
England, McDowall (59) found no correlation between cancer mortality and 
residential exposure to the fields from electrical utility installations (substa- 
tions and overhead power lines). This study involved a retrospective analysis 
of mortality from 1971 to 1983 among a population of 7631 persons in East 
Anglia who were identified as living near electrical installations. The stan- 
dardized mortality ratios for this large study population were lower than 
expected for three major causes of death: cancer, cardiovascular disease, and 
respiratory disease. The results of this study, therefore, did not support other 
claims of an elevated cancer risk associated with residential exposure to 
power-frequency fields. 

A case-control study of the incidence of acute nonlymphocytic leukemia 
(ANL) in three counties in Washington state also failed to find a correlation 
between residential exposure to 60-Hz fields and cancer risk (88). For 164 
cases of ANL and 204 controls from the same geographic area, residential 
wiring codes were analyzed by the Wertheimer/Leeper technique and direct 
measurements were made of the residential electric and magnetic fields (43). 
Several confounding variables, such as smoking habits and socioeconomic 
status of the case and control subjects, were analyzed. The overall results 
provided no evidence for a possible association between residential exposure 
to 60-Hz fields and the risk of ANL. 


Occupational Exposure and Cancer Risk 


The controversy surrounding the issue of exposure to ELF fields and cancer 
risk has been increased by numerous epidemiological reports published since 
1982, in which an apparent association was found between employment in 
various electrical occupations and cancer risk (primarily leukemia and tumors 
of the nervous system). Many of these studies have been reviewed previously 
(19, 80, 97). Savitz & Calle (82) have attempted to collate the data from 11 of 
these published studies to estimate the average relative risk of all leukemias, 
acute leukemias, and acute myelogenous leukemias among workers in 12 
different classes of electrical occupations. The overall relative risk and 95% 
confidence intervals for leukemia mortality were the following: total leuke- 
mias, 1.2 (1.1-1.3); acute leukemias, 1.4 (1.2-1.6); and acute myelogenous 
leukemias, 1.5 (1.2-1.8). Savitz & Calle (82) concluded that a correlation 
exists between employment in electrical occupations and leukemia risk. 
However, they pointed out that none of the epidemiological surveys con- 
ducted thus far has established that exposure to 60-Hz electromagnetic fields 
is the causal factor that leads to an elevated cancer risk among electrical 
workers. Similar results and conclusions were obtained by Coleman & Beral 
(19) in a meta-analysis of data on leukemia risk among workers in electrical 
occupations. 
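
The pooled estimates quoted above can be read as follows: an interval whose lower bound exceeds 1.0 indicates a statistically significant excess at the 5% level. A minimal sketch, assuming the intervals are symmetric on the logarithmic scale (a common convention that is not stated in the text), also recovers the approximate standard error of ln(RR). The function and variable names are illustrative.

```python
import math

# Pooled relative risks and 95% confidence intervals quoted in the text:
# name -> (relative risk, lower bound, upper bound)
pooled = {
    "total leukemias": (1.2, 1.1, 1.3),
    "acute leukemias": (1.4, 1.2, 1.6),
    "acute myelogenous leukemias": (1.5, 1.2, 1.8),
}

def log_se_from_ci(lower, upper, z=1.96):
    """Approximate SE of ln(RR), assuming a log-symmetric 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def excludes_unity(lower, upper):
    """True when the interval does not contain RR = 1."""
    return lower > 1.0 or upper < 1.0

for name, (rr, lower, upper) in pooled.items():
    print(f"{name}: RR={rr}, SE(ln RR)={log_se_from_ci(lower, upper):.3f}, "
          f"excludes 1: {excludes_unity(lower, upper)}")
```

All three intervals exclude 1.0, which is consistent with the stated conclusion that a correlation exists; as the text notes, this says nothing about whether field exposure is the causal factor.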

Four separate studies have reported an increased risk of brain tumors 
among workers in electrical occupations (51, 61, 92, 106). These epidemiolo- 
gical studies were based on data obtained from death certificates of workers in 
electrical occupations in different geographic areas within the US. Although 
the findings were reasonably consistent, none of these studies established a 
true causal relationship between exposure to 60-Hz fields and the risk of brain 
malignancies. In addition, the potential contribution to cancer risk of expo- 
sure to other agents, such as organic solvents, was not assessed in any of the 
epidemiological studies on brain tumor incidence among electrical workers. 


Critique of Epidemiological Studies 


Overall, the epidemiological studies on the possible correlation between 
cancer risk and residential exposure to electromagnetic fields do not support 
the conclusion of a strong association. In the earlier studies on this subject, 
especially those conducted by Wertheimer & Leeper (117, 119) in the Denver 
area, the control groups were chosen in a nonblind manner. In addition, 
quantitative measurements of the 60-Hz fields within the residences of the 
case and control subjects have been made only in the studies by Savitz et al 
(85) and Severson et al (88), and in the above-mentioned study by London et 
al. As discussed above, magnetic field levels measured in homes have shown 
a much weaker association with cancer risk than have the Wertheimer/Leeper 
wire codes. Finally, with the exception of these two studies, no attempt was 
made to analyze the role of confounding variables in the overall cancer risk of 
the case and control populations. Savitz & Feingold (83) have found that 
residential traffic density is strongly associated with childhood cancer, es- 
pecially leukemia, among the same study population that was previously 
reported to have an association between cancer risk and power line con- 
figurations (85). They speculated that benzene, a known leukemogen, in 
automobile exhaust fumes may have been a contributing factor in the elevated 
incidence of childhood cancer. Savitz & Baron (81) have emphasized the 
importance of estimating and correcting for confounding variables in 
epidemiological studies. In the case of ELF fields, several large studies are 
currently underway in the US and other nations that are attempting to identify 
possible confounding variables (40, 111). 

In view of limitations in the epidemiological studies conducted to date, it is 
not possible to conclude that a definite association exists between the expo- 
sure of individuals to ELF fields and their relative risk of contracting leukemia 
or other forms of cancer. The available evidence suggests that workers in 
electrical occupations have an increased risk of cancer, primarily leukemia 
and nervous tissue tumors. Similarly, individuals living in homes near high- 
current configurations of power distribution lines may have an elevated cancer 
risk, although the available data are not convincing. In a recent literature 
review (111), staff members at the EPA indicated that “with our current 
understanding, we can identify 60-Hz magnetic fields from power lines and 
perhaps other sources in the home as a possible, but not proven, cause of 
cancer in humans.” The EPA report stated further that, in spite of method- 
ological weaknesses, “the occupational studies tend to support the results of 
the childhood cancer studies, and excesses occur at the same sites.” There 
exists a clear need for additional epidemiological surveys on large populations 
of subjects, in which efforts are made to analyze the possible role of con- 
founding variables and to conduct proper dosimetry measurements for expo- 
sure assessment. Approximately 20 epidemiological studies on cancer risk in 
relation to power-frequency electromagnetic field exposure are currently 
under way in Europe, Australia, and North America (40, 111). 


EFFECTS OF ELF MAGNETIC FIELDS ON CARDIAC 
PACEMAKERS 


Extremely-low-frequency magnetic and electric fields can produce 
electromagnetic interference (EMI) in implanted medical electronic devices, 
such as cardiac pacemakers (37, 101). The unipolar design of demand cardiac 
pacemakers, in which the cathode lead is implanted in the heart and the 
pacemaker case serves as the anode, is particularly susceptible to low- 
frequency EMI. In experimental studies with a magnetic resonance imaging 
system, Pavlicek et al (70) found that a rapidly-switched gradient field with a 
time variation of 3 T/s can induce potentials up to 20 mV in the loop formed 
by the electrode lead and the case of a unipolar pacemaker. Jenkins & Woody 
(42) examined 26 pacemaker models for sensitivity to 60-Hz magnetic fields. 
Twenty of these units reverted to an asynchronous mode or exhibited abnor- 
mal pacing characteristics in 60-Hz fields with amplitudes ranging from 0.1 to 
0.4 mT. The average threshold flux density for producing pacemaker 
malfunction was 0.2 mT. This level is high relative to common human 
exposures, but is less than the magnetic flux densities near the surfaces of 
many appliances and tools (28). 


EXPOSURE GUIDELINES FOR ELF MAGNETIC FIELDS 


Although many states in the US have established limits on human exposure to 
60-Hz fields in the vicinity of high-voltage transmission lines, there are 
currently no federal regulations on public or occupational exposures to fields 
in the ELF range. However, guidelines for exposure to fields in this frequency 
range have been established in West Germany and the United Kingdom. In 
the West German guidelines (11), the exposure limit for magnetic fields was 
set at the level of 20 mT for frequencies below 2 Hz. At frequencies from 2 to 
10,000 Hz, the exposure limit was set in accord with the formula B (rms) = 
27.135/f° 2 mT, which gives a limit of 5.0 mT at 50 Hz. In the United 
Kingdom (65), the time-varying magnetic field limit recommended by the 
National Radiological Protection Board (NRPB) for occupational exposures 
was set at 10 mT for frequencies below 10 Hz. At frequencies in the range of 
10 to 750 Hz, the occupational exposure limit was set in accord with the 
formula B (rms) = 94/f mT, which gives a limit of 1.88 mT at 50 Hz. At 
frequencies from 750 to 50,000 Hz, the exposure limit was set at 0.125 mT. 
The guideline also states that occupational personnel should not be exposed to 
the maximum permissible field levels for more than two hours per day. 
Significantly lower exposure limits were established in the NRPB guidelines 
for the general public. 
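
The NRPB occupational limits described above form a piecewise function of frequency. A minimal sketch follows; the function name is illustrative, and the treatment of the exact boundary frequencies (10 Hz and 750 Hz) is an assumption, because the text does not say to which band each endpoint belongs.

```python
def nrpb_occupational_limit_mt(freq_hz):
    """NRPB occupational magnetic flux density limit in mT, as summarized in the text.

    The 10-750 Hz band uses the rms formula B = 94/f mT.
    Boundary handling at exactly 10 Hz and 750 Hz is an assumption.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz < 10:
        return 10.0
    if freq_hz <= 750:
        return 94.0 / freq_hz
    if freq_hz <= 50_000:
        return 0.125
    raise ValueError("frequency is above the 50-kHz range summarized in the text")

print(nrpb_occupational_limit_mt(50))  # 1.88, the value quoted in the text
```

The bands join almost continuously: the formula gives 9.4 mT at 10 Hz and about 0.125 mT at 750 Hz.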

In 1990, the International Non-Ionizing Radiation Committee (INIRC) of 
the International Radiation Protection Association (IRPA) recommended a set 
of exposure limits on 50/60 Hz electric and magnetic fields (41). The limit on 
magnetic flux density for occupational exposures was set at 0.5 mT for the 
entire workday, 5 mT for exposures of less than two hours duration, and 25 
mT for exposure of the limbs throughout the workday. The IRPA/INIRC 
exposure limits for the general public were set at 0.1 mT for continuous 
exposures and 1.0 mT for exposures during periods of a “few hours per day.” 

In 1991, the American Conference of Governmental Industrial Hygienists 
(ACGIH) proposed guidelines for occupational exposure to time-varying 
magnetic fields with frequencies in the range of 1 Hz to 30 kHz (3). The 
maximum exposure level at 60 Hz was set at 1 mT, which limits the 
maximum induced current density within the body to a root-mean-square 
value of 10 mA/m². The rationale for the ACGIH guidelines was that, apart 
from the controversial issue of cancer risk in relation to occupational or 
residential exposure to 60-Hz fields, there is no strong evidence for harmful 
effects of ELF magnetic fields that induce current densities in the body of 10 
mA/m² or less. The ACGIH also recommends that personnel wearing cardiac 
pacemakers should not be exposed to 60-Hz magnetic fields above 0.1 mT to 
avoid possible problems with electromagnetic interference. 


SUMMARY AND CONCLUSIONS 


Various effects of ELF magnetic fields have been reported to occur 
at the cellular, tissue, and animal levels. Certain effects, such as the induction 
of magnetophosphenes in the visual system, have been established through 
replication in several laboratories. Many other effects, however, have not 
been independently verified or, in some cases, replication efforts have led to 
conflicting results. A substantial amount of experimental evidence indicates 
that the effects of ELF magnetic fields on cellular biochemistry, structure, and 
function can be related to the induced current density, with a majority of the 
reported effects occurring at current density levels in excess of 10 mA/m². 
These effects, therefore, occur at induced current-density levels that exceed 
the endogenous currents normally present in living tissues. From this perspec- 
tive, it is extremely difficult to interpret the results of recent epidemiological 
studies that have reported a correlation between cancer incidence and expo- 
sure to 50-Hz or 60-Hz magnetic fields with very low flux densities. The 
levels of current density induced in tissue by occupational or residential 
exposure to these fields are, in nearly all circumstances, significantly lower 
than the levels found in laboratory studies to produce measurable per- 
turbations in biological functions. There is a clear need for additional 
epidemiological research to clarify whether exposure to ELF magnetic fields 
is, in fact, causally linked to cancer risk. Laboratory animal studies conducted 
under controlled conditions are also needed to determine whether ELF mag- 
netic fields can initiate or promote tumors. In addition, more studies of both a 
theoretical and experimental nature are needed to elucidate the molecular and 
cellular mechanisms through which low-intensity magnetic fields can in- 
fluence living systems. A growing body of evidence indicates that cell 
membranes play a key role in the transduction and amplification of ELF field 
signals. Elucidation of the physical and biochemical pathways that mediate 
these transmembrane signaling events will represent a major advance in our 
understanding of the molecular basis of magnetic field effects on biological 
systems. 


INTRODUCTION 


Drug testing in the workplace presents a striking case of a policy instrument 
that has penetrated fast and far, accompanied by almost no credible scientific 
warrant of effectiveness. In just a decade, worksite drug testing programs 
have made their way through the entire military and all other federal employ- 
ment, into many of the nation’s largest and most prestigious private corpora- 
tions, and all the way to the United States Supreme Court. Meanwhile, solid 
research has focused almost entirely on the efficacy of the testing procedures 
themselves, to the virtual exclusion of deeper questions of sound social 
policy. The most fundamental assumptions on competing sides of unresolved 
debates over the merits and demerits of screening workers’ urine for traces of 
drugs are untested and, given what we currently know, untestable. In fact, 
the gap between science and policy is so wide as to occasion real confusion 
about which goals are really being served by drug testing programs at work. 
Goals and Types of Testing Programs 


Ostensibly, employers test workers’ urine for drugs to ensure the safety and 
productivity of their own labor force and to reduce the likelihood that their 
employees will report to work impaired by drugs, will use drugs while at 
work, will buy and sell them there, or will engage in other drug-related 
activities that might endanger themselves or others. Employers are also 
concerned that drug use will result in time away from work, distress cowork- 
ers or customers, or intrude in any other way on the safe and satisfactory 
conduct of whatever commerce the work entails. 

As a matter of public policy, government encourages, facilitates, and 
partially subsidizes private-sector drug testing programs to reduce national 
consumption and traffic in illicit drugs, and the negative social consequences 
of these problems. Worksite testing presumably creates a sentinel effect and 
aids in the identification of drug abusers who could perhaps be helped by 
effective treatment. To what extent these objectives are actually being served 
has yet to be asked, except superficially. 

Testing is generally conducted under three broad rubrics: at pre- 
employment; “for cause”; and randomly, without cause or suspicion. Each 
situation has its own supporting logic and set of complexities. Employers 
have long conducted preemployment or preplacement tests of various kinds to 
establish fitness for duty (70). Drug testing continues in this tradition. A 
succession of state and federal handicap laws and regulations has narrowed 
the employer’s discretion in hiring, but obligations to job applicants are still 
less compelling and constraining than those to employees on the payroll. 
Preemployment testing is, therefore, the most common and least contested 
form of drug testing at work. 

Testing “for cause” extends an established tradition of investigating injuries 
and other incidents. Sometimes, such testing monitors compliance with a 
mandatory program of rehabilitation for a substance-abusing employee whose 
unsatisfactory performance has prompted a referral to an employee assistance 
program (EAP) (70). The purported link with safety, in a postincident in- 
vestigation and with both safety and therapeutic intent in EAPs, renders this 
class of testing relatively less controversial and more common than testing 
without grounds for suspicion. However, the objection is raised that drug 
testing may preempt other accident investigation, which could lead to the 
abatement of hazards, and that the link between safety risk and a positive 
urine test is far from clear. Grievance processes provide some formal protec- 
tion for union-represented employees. How frequently this testing results in 
employee dismissal, or in successful rehabilitation, is a question for future 
research. 

Periodic, unannounced testing of employees picked at random, either from 
a sample of workers who perform “safety-sensitive” tasks or from a larger 
pool (extending occasionally to a firm’s total labor force), is the most novel 
and least common form of worksite drug testing, and the form that civil 
libertarians find most objectionable. The further a random testing program 
strays from a convincing safety rationale, and the less attention it pays to 
procedural safeguards of the employee’s perceived right to privacy and fair 
play, the wider it is open to critique. 

From a public health perspective, the overriding question is whether and, if 
so, how drug testing programs prevent or postpone any death, disease, 
disability, or dysfunction associated with psychoactive drugs. Here, several 
important assumptions underlying worksite drug testing bear examination. 
The first is that a sufficiently serious problem exists to justify a response that 
may violate deeply felt norms. The second is that the response produces more 
benefits than harm. The third is that preferable solutions are unavailable or 
unaffordable. 


Coverage of this Review and of the Available Literature 


In this review, we seek to clarify what is and is not known about drug testing, 
by drawing on a sizeable descriptive and prescriptive literature that emanates 
chiefly from the National Institute on Drug Abuse (NIDA), as well as on other 
practitioner-directed sources.¹ In addition, the literature includes extensive 
discussion, often in special issues of law reviews, of ethical and legal 
ramifications, many of which are cited below. 

The current literature lacks studies that would provide empirical grounding 
for the assumptions enumerated above: convincing studies documenting that 
drug abuse at work is, in fact, a serious threat (and, specifically, where and 
how the harms are manifest); evaluations demonstrating the effectiveness of 
drug testing strategies; and studies comparing drug testing with alternative 
policies and programs aimed at reducing worksite inefficiencies or malfunc- 
tions related to drug abuse, such as EAPs, or comparing EAPs with and 
without a drug testing component. 

Several descriptive studies have discussed results of drug screening pro- 
grams, but without adequate designs to test hypotheses regarding effective- 
ness (12, 48, 52, 64). Three electronic literature searches of bibliographic 
data bases in medicine, sociology, psychology, health administration, and 
government uncovered only six studies designed to test the hypothesis that 
drug screening is an effective strategy to ensure the safety and/or productive 
capacity of any employee group (5, 16, 50, 54, 61, 77). Three of these studies 
were published in peer-reviewed journals, and one evaluated costs versus 
benefits (50). Of the remaining three studies, two also analyzed costs and 
benefits (16, 61), and one of these (16) appeared in an NIDA monograph and 
was thus not subject to peer review. 

¹Course materials from the American College of Occupational Medicine, newsletters, loose-leaf 
services and reports from information management firms, such as the Bureau of National 
Affairs, Business Research Publications, and the Conference Board, as well as a few reports from 
the popular press and some unpublished results of opinion polls. 

If the shortage of evaluative research is a problem, the lack of information 
on rudimentary questions—beginning with the extent and distribution of 
different types of drug testing—is more handicapping still. Many surveys 
have been done, but few well enough to inspire confidence in their validity; 
we cite the better ones, with caveats. Also, although debates regarding drug 
testing as a policy vehicle rely on the support (or opposition) of public 
opinion, most summary statistics withstand little scientific scrutiny, a point 
we elaborate. 

Before reviewing the scientific literature on the extent of drug abuse in the 
workplace, the effectiveness of testing programs, and the feasibility of 
alternative approaches, we set the context with a brief overview of the 
emergence and apparently rapid diffusion of this technology, including the 
evolving legal doctrine and regulatory directive that shape its application. 


EVOLUTION OF DRUG TESTING AS PUBLIC POLICY 


Technological advances made drug testing practical on a mass scale, and 
much subsequent thinking has centered on “technique” (21). The seeds were 
sown some 16 years ago, when NIDA began supporting research on problems 
associated with drugs, including a line of work on methods to detect drug 
traces in urine and other bodily fluids. Originally, the technology was pri- 
marily used to monitor heroin and methadone use in drug treatment centers. 
Patients were forewarned that they would have to submit to drug testing as a 
condition of participation. In 1978, NIDA entered into a cooperative agree- 
ment with Syntex Corporation to develop a relatively inexpensive, rapid assay 
to detect marijuana use. Similar assays for amphetamine and opiate use were 
already available. By the end of 1981, new, portable assay machines were 
brought to market, under the trade name EMIT (enzyme-multiplied immunoassay 
test). Hoffmann-La Roche Company also came to market in the 
early 1980s with a test kit (“Abuscreen”) that used radioactive immunoassay. 

At about this time, Congress was pressing the Department of Defense to do 
something about drug abuse in the military, which was perceived to be a 
growing problem that could compromise national security. A Department of 
Defense survey, conducted in November 1980, found that 47% of Navy and 
Marine personnel, age 25 or under, reported having used marijuana, and 26% 
reported having been under the influence of drugs while on duty (10). The 
following May, a Navy plane crashed off the Florida Atlantic coast into the 
aircraft carrier USS Nimitz, injuring 42 sailors and killing 14, six of whom were 
found on autopsy to have had traces of marijuana in their blood (10, 39). Soon 
thereafter, Naval Admiral Hayward announced a policy of “zero tolerance” of 
drugs, and instituted a Navy-wide testing program. By the end of 1982, the 
Navy had portable testing machines in most ships in the fleet; more than 2 
million urine specimens are now tested annually by the US Navy (10). 

Although the Navy’s program was and is the most extensive in the armed 
forces, all branches were testing for drugs by 1981. The military crack-down 
attracted media attention, which, in turn, began to stimulate the interest of 
private-sector employers. IBM Corporation announced a job applicant drug 
testing program in 1984 (57), and American Airlines and Alabama Power 
Company soon followed suit. By 1986, many large private-sector firms were 
testing job applicants, as well as some current employees, for traces of drugs. 
Typically, drug screening in these large firms was placed in the context and 
under the aegis of long-established programs of occupational health sur- 
veillance, employee assistance, and/or health promotion, such as those at E. 
I. DuPont de Nemours, the General Electric Company, Eastman Kodak 
Company, and Exxon Corporation. 


Federal Regulatory Initiatives 


Rather than await spontaneous diffusion into the private sector, however, the 
government continued to promote urine screening through regulatory initia- 
tives. At first, the program only covered federal employees; later, it was 
expanded to encompass private-sector employees engaged in specified types 
of governmental work. Executive Order 12564 (56) was a watershed. Signed 
on September 15, 1986, by President Ronald Reagan, the order required each 
executive agency to establish a program to test for use of illegal drugs by 
federal employees in “sensitive positions” (broadly defined),² and to offer 
voluntary testing. The order also authorized testing for cause, as follow-up to 
counseling or rehabilitation, and at preemployment. 

The immediate stimulus for Executive Order 12564 was the final report of 
the President’s Commission on Organized Crime in March 1986 (24), which 
recommended mandatory drug testing as part of an overall strategy shift. 
Resources would be diverted from the “supply-side” effort to a “demand- 
reduction” approach aimed at drying up the market in illicit drugs (24). The 
agencies were left latitude to shape their own programs’ requirements and 
circumstances for testing, but overarching rules of procedure were specified 
in the order. 

In 1987, Congress appropriated funds to implement the executive order (9), 
and the Department of Health and Human Services (DHHS) drew up technical 
and scientific guidelines, published in the Federal Register on April 11, 1988 
(19). “At a minimum,” marijuana and cocaine were to be included in government 
testing programs; opiates, amphetamines, and phencyclidine (PCP) 
were optional; and testing for other drugs listed in Schedule I or II of the 
Controlled Substances Act required permission of DHHS. The five drugs for 
which testing was mandated and/or authorized came to be known as “the 
NIDA five.” Alcohol, the most widely used of all psychoactive drugs and 
by far the most costly from a social and health perspective, was not mentioned, 
although it has appeared on some (but not all) subsequent lists. 

²“Sensitive jobs” were defined as any requiring a high degree of trust and confidence. Positions 
involving national security, handling sensitive documents or serving the President, enforcing the 
law, or protecting life, property, public health, and safety met this definition, which is believed to 
encompass 400,000-500,000 jobs (R. Harwood 1991, personal communication). 


The Content of Federal Regulations and Guidelines 


The DHHS guidelines also outlined collection protocols to discourage 
adulteration of urine specimens, procedures on “chain-of-custody” to protect 
against misidentification of specimens or results, and measures to assure the 
quality of laboratory testing. In addition, they created a new title and role for 
physicians: “Medical Review Officers.” These physicians would review posi- 
tive urinalysis test results, interpret their significance, and explore mitigating 
medical circumstances that might exonerate the employee in question. 

Much of the administrative apparatus developed in connection with testing 
of government workers was later extended and elaborated in regulations from 
the Departments of Defense (18) and Transportation (20), which required 
testing of employees of companies with defense contracts or those involved in 
transportation and employees of the Nuclear Regulatory Commission (NRC) 
(51), which stipulates fitness for duty requirements in nuclear reactor work. 
The various federal regulations are far from uniform, though, thus creating 
compliance headaches for large companies with diverse operations that fall 
within the ambit of multiple regulatory agencies. 


The Drug-Free Workplace Act of 1988 


These provisions were consolidated and extended when Congress enacted the 
Drug-Free Workplace Act of 1988. The act covered all federal grantees 
(including universities) and most federal contractors (including defense con- 
tractors), so that employees “providing services or products to the government 
should not be held to a lower standard than federal employees with whom the 
contractors work side-by-side” (9, p. 512). Although it does not require drug 
testing, the act does lend legitimacy to tougher approaches to drugs and 
provides a specific rationale for drug testing, as well as for strictly sanctioning 
employees convicted of drug infractions at work. 

This cursory review of the emergence of drug testing as public policy 
shows how decisive a role the federal government has played. Early problems 
that surfaced because of invalid and unreliable drug testing procedures led the 
Executive Branch to devote much of its energy to developing a tight adminis- 
trative system of quality controls. This effort has been quite successful, but 
has not spoken to more fundamental questions of social benefits and costs. It 
has taken a succession of court challenges to begin enlarging the terms of the 
public debate. 


The Unfolding Legal Framework 


Since the guarantees of the US Constitution constrain principally actions by 
the state, the legal battleground over drug testing has centered on government 
agencies, as well as private employers acting on federal or state rules that 
require or authorize drug testing (62). The primary constitutional impediment 
to drug testing is the Fourth Amendment’s prohibition on unreasonable 
searches and seizures. The Supreme Court has long recognized that the 
collection and subsequent analysis of biological samples are “searches” under 
the Fourth Amendment (59). The question is whether the analysis of blood, 
urine, or breath for illicit substances is “unreasonable” within the meaning of 
the Fourth Amendment. What is reasonable “depends on all the circumstances 
surrounding the search or seizure and the nature of the search or seizure itself” 
(65). Thus, the permissibility of a particular search “is judged by balancing its 
intrusion on the individual’s Fourth Amendment interests against its promo- 
tion of legitimate governmental interests” (17). Minimally, the courts have 
required “reasonable suspicion” that the illicit substance or contraband to be 
searched for will, in fact, be found. 

The Supreme Court in the drug testing cases held that when the state has 
“special needs beyond the normal need for law enforcement,” the warrant and 
probable or reasonable cause requirements may become impracticable. In 
Skinner v. Railway Labor Executives’ Association (62), the Supreme Court 
upheld a regulation that required drug and alcohol tests following major train 
accidents or incidents and authorized these tests for covered employees who 
violate certain safety rules, even without reasonable suspicion that any par- 
ticular employee may be impaired. The Court viewed the government’s 
interest in safety of the traveling public as “compelling.” An individualized 
suspicion requirement would “impede the railroad’s ability to obtain valuable 
information about the causes of accidents or incidents and how to protect the 
public . . .” (62, p. 1406). 

The Court held that when balanced against the state’s compelling interest, 
the drug tests represented a minimal imposition on the workers’ privacy and 
bodily integrity, in part because the samples were furnished in a medical 
environment without direct observation. The workers’ expectations of priva- 
cy, moreover, were diminished by their participation in an industry regulated 
pervasively to ensure safety. The Court also rejected the argument that the 
testing program was unreasonable because it could not measure current 
impairments; drug tests are “designed not only to discern impairment but to 
deter it” (62, p. 1407). 

204 WALSH, ELINSON & GOSTIN 

The Supreme Court followed a similar approach in National Treasury 
Employees Union v. Von Raab (46), in which it upheld the constitutionality of 
suspicionless drug testing by the US Customs Service. The government’s 
“compelling” interest in safeguarding borders and public safety outweighed 
the diminished privacy expectations of employees who are directly involved 
in interdiction or who are required to carry firearms. No evidence was 
presented to the Court that a drug problem existed in the Customs Service, 
that a preannounced suspicionless drug testing program was effective in 
detecting or deterring drug use, or that drug use impugned the integrity or 
interfered with the judgment of officers. 

The question that emerges from the Supreme Court drug testing cases is 
how will the courts in the future balance the employee’s interest in privacy 
with “compelling” state interests? The Supreme Court in Von Raab identified 
specific factors that minimize the program’s intrusion on privacy, such as 
nonrandom tests, advance notice of a sample collection, and confirmatory 
tests. Court decisions subsequent to Von Raab have not used any single 
privacy factor as conclusive. The fact that a testing program is random has not 
persuaded the courts to undertake a fundamentally different analysis from that 
pursued by the Supreme Court (28). 


Legal Constraints on Private Sector Employers 


Private employers without significant government contracts or regulated lines 
of business enjoy greater freedom in their design and implementation of drug 
testing programs. However, they do face legal restraints embodied in state 
constitutions, federal and state statutes, common law, and collective bargain- 
ing agreements. Some state constitutions (California’s, for example) constrain 
even the purely private sector. 

Federal and state laws prohibit employment discrimination against persons 
with disabilities. The Americans with Disabilities Act of 1990 specifically 
excludes from protection any employee or applicant “who is currently engag- 
ing in the illegal use of drugs . . .” (2, Secs. 104, 510). Although the Act 
generally prohibits medical testing and examination of employees unless such 
testing is job related and consistent with business necessity, the Act does 
not count drug testing as a medical examination. Indeed, the Act does nothing to encourage, prohibit, 
or authorize drug testing of job applicants or employees. The Federal 
Rehabilitation Act of 1973 (which prohibits discrimination against persons 
with disabilities only if an employer is in receipt of federal funds) also 
excludes current drug users from its protection, but does not have the same 
explicit exclusion of drug testing. 

Many states and a few municipalities have enacted statutes or ordinances to 
regulate drug testing. The legislation varies in application and approach. Most 
statutes cover all drug testing, but a few protect only public employees, and 
most stress procedural fairness, privacy, and fair use of positive test results. 

In summary, worksite drug screening has advanced rapidly in the last 
decade. Beginning as an isolated policy in the military, it has been expanded 
by increments into many areas of federal employment, then into government 
contractors, and finally into the private sector. The 1989 Supreme Court cases 
may limit future testing by federal and state agencies and their contractors to 
specific safety and security oriented jobs. Similar limitations on most of the 
private sector are likely to rest on state legislative and judicial initiatives, if 
they develop. 


HOW SERIOUS IS THE PROBLEM OF DRUG ABUSE AT WORK? 


If the nation’s drug drama is being played out mostly in the shadow economy, 
outside of conventional places of work, then screening workers for drugs may 
be tantamount to looking under the lighted street lamp for the key that was lost 
somewhere else in the dark. To what extent this is the case is unknown. 
Overall rates of illicit drug use have been decreasing in recent years (33), 
despite pockets of problems in inner cities depicted in popular films and 
almost daily in metropolitan newspapers. But, empirical data on rates of drug 
use on or around the job are limited. If there are concentrations of serious drug 
problems in places of work, they have yet to be well characterized in 
systematic research. 

Research on drug use at work is scarce and difficult. Employers have much 
to lose if they develop a reputation as a place where workers are abusing 
drugs. And, for a worker to admit using an illicit drug is to confess to an 
offense that could result in job loss or even arrest. Cross-sectional surveys can 
be conducted successfully with strong guarantees of anonymity, but even they 
require elaborate negotiations and a slow process of building trust. Reported 
rates of drug use among workers are low enough that samples have to be large 
to support multivariate analyses and meaningful statistical inferences (68). 
Most data on drug abuse come from national surveys, not from studies 
anchored in places of work, and few even differentiate workers from 
nonworkers, much less delve into which workers doing what kinds of work 
use drugs, and who uses drugs in the context of work, when, and with what 
effects. 


National Trends in Drug Use 


Since 1972, NIDA has been conducting a national survey on drug abuse. Well 
before President Reagan’s 1986 executive order initiated drug testing of 
federal workers, such national surveys began to pick up signals that rates of 
illicit drug use were starting to decline. Between 1979 and 1988, rates of 
reported use among 18- to 25-year-olds in the household survey dropped from 
37.1% to 25.7% (a 31% decrease) (45). 
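
The 31% figure is a relative decrease, not a drop of 31 percentage points. A minimal sketch of the arithmetic, using only the two survey rates quoted above (the function and variable names are illustrative and do not come from the survey report):

```python
def relative_decrease(before: float, after: float) -> float:
    """Return the fractional decline from `before` to `after`."""
    return (before - after) / before

# Rates of reported use among 18- to 25-year-olds (percent), 1979 and 1988.
rate_1979 = 37.1
rate_1988 = 25.7

absolute_drop = rate_1979 - rate_1988  # in percentage points
relative_drop = relative_decrease(rate_1979, rate_1988)

print(f"absolute drop: {absolute_drop:.1f} percentage points")
print(f"relative drop: {relative_drop:.1%}")
```

On these figures the absolute drop is 11.4 percentage points, which is about 31% of the 1979 rate.
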
At the same time, drugs continue to be implicated in festering problems in 
the nation’s inner cities: drug-related homicides, felony convictions, bystand- 
er deaths in shootings, domestic violence, emergency room admissions, HIV 
transmission, low birth weight, and multiply complicated deliveries (74). 
But the 1988 NIDA household survey revealed that overall per capita consumption 
of cocaine (especially “crack”) has declined in every age group, and 
that crack users are twice as likely to be unemployed as to be full-time 
employees (45). 

Another NIDA-funded national study, an annual survey and follow-up of 
high school seniors conducted since 1976, has been corroborating the 
impression of declining rates of drug abuse in the general population (33). The 
implications for the workplace, however, are not well understood. 


Drug Use Among Workers 


A simple way to begin asking about workplace effects is to examine the 
employment status of respondents in national surveys. Voss (66) conducted a 
secondary analysis of data from the 1985 NIDA household survey and 
ascertained that marijuana use in the past month was reported by 11.7% of 
respondents who were employed full-time, and by 10.2% of part-time em- 
ployees. It was the unemployed who reported the highest rates (21.5%) of 
marijuana use. Comparable rates of cocaine use among full- and part-time 
employees were 4% and 2.2%, respectively, compared with 6% among the 
unemployed. Data on the employment status of drug users still leave un- 
answered the question of drug use on the job or the effect of drug use on the 
conduct of work (49). 

The limited data suggest that we actually know almost nothing about the 
impact of drug abuse in particular worksites. That being the case, we need to 
look elsewhere for the underpinnings of drug testing programs at work. 


Public Perceptions of the “Drug Problem” and Attitudes Toward Worksite Testing 


Americans are frequently told that the drug problem is among the most serious 
that society faces. These conclusions are derived from public opinion polls 
that are widely quoted in the lay press. Methodological limitations tend to be 
glossed over, and the strong impression is left that the general public, 
management, workers, and selected working groups are very worried about 
drugs. 

Furthermore, we are told, employees and managers perceive drug use at 
work to be a real concern (3, 8, 42) and are generally supportive of drug 
testing, particularly for cause and for safety-sensitive positions (31, 32, 36). 
Flaws in the sampling and data collection procedures of many of these surveys 
cast doubt on their reliability, validity, and especially their generalizability. 
Although a few representative national polls have been done, it is still unclear 
how broad, deep, and stable the support for drug testing really is. 

For example, organizations like the Institute for a Drug-Free Workplace 
and the American Productivity and Quality Center commission and quote the 
results of surveys that purport to show that American workers and managers 
are very concerned about the drug problem and quite willing to be tested for 
drugs. In 1990, the Institute (32) reported that 28% of employees interviewed 
said that drugs were the greatest problem facing the United States today, 49% 
said that illegal drug use occurs in their own workplaces, 22% called illegal 
drug use at least “somewhat widespread” where they work, and 41% said drug 
use by employees “seriously affects” getting the job done. A full 97% of 
respondents were said to favor drug testing under some circumstances: 75% 
for cause, 68% at preemployment, and 53% on a random basis. 

These findings tend to be reported with little or no contextual information 
on basic methodological issues: how the questions were posed, how the 
samples were drawn, what the response rates were, and so on. The organiza- 
tions sponsoring the polls often have vested interests that are not immediately 
evident. The Institute for a Drug-Free Workplace, for example, comprises 
representatives of large corporations, many already with drug testing programs 
firmly in place. Hoffmann-La Roche, a major competitor in the drug 
testing market, is an especially active member. 

This is not to say that the public opposes drug testing at work, only to raise 
concerns about the quality or strength of the evidence. Several news organiza- 
tions have released more reliable opinion polls that do tend to support findings 
in favor of drug testing. A CBS/New York Times survey, conducted in 
September 1989, found that 61% of respondents would favor a policy that 
required “workers in general to be tested to determine whether they have used 
illegal drugs recently . . .” (58). A May 1986 poll (Decision/Making/ 
Information for Populus, Inc.) found that 88% of respondents supported 
mandatory testing among airline pilots, 85% among police and law enforce- 
ment agents, and 74% among teachers (58). Other, more limited surveys have 
reported similar results (3, 36). 

What seems to be operative here is a tendency that historians and students 
of the social construction of reality have observed in the cyclical nature of 
public attitudes toward various kinds of drug use. Tolerance of drugs seems to 
be shaped as much by larger social and economic forces, and by conservative 
or liberal strains in the general social mood, as by the objective reality of the 
prevalence or impact of the problem itself (27, 44, 71). An historical perspec- 
tive suggests that the United States had already begun to enter a new period of 
public reaction against drug use before testing became widespread in the 
workplace. With or without testing programs, if history is an accurate guide, 
those trends would likely continue (44, 72). To believe that drug testing has 
played a significant role in promoting or accelerating this process, we would 
need convincing evidence of direct impacts that these testing programs are 
having. 


HOW WIDESPREAD ARE DRUG TESTING PROGRAMS? 


Data on the prevalence and distribution of drug testing programs in public- 
and private-sector employment are thin. Most available surveys have been 
weak; federal requirements are complex and require no formal reporting; and 
drug testing programs are diverse, protean, and sometimes different on paper 
than in practice. The evidence suggests the following tentative conclusions: 
much (but not all) of drug testing has been precipitated by federal regulation; 
testing is much more common in large, rather than small, companies; preem- 
ployment screening of job applicants is the most common form of testing; 
testing of current employees is conducted mostly for cause; and random, 
unannounced testing not based on suspicion is still relatively rare. 

Coverage of federal mandates for drug testing ought to provide some sense 
of the number of employees who are subject to testing, but no one data base 
currently exists to provide actual numbers on how many workers are covered 
by the Departments of Transportation and Defense and NRC regulations, how 
many actually are tested, and under which circumstances. In principle, the 
Drug-Free Workplace Act has sweeping ramifications for private-sector
employment, but the extent and nature of responses to date are undocumented.

Growth in the numbers of commercial drug testing laboratories or in their 
volume of sales might be an objective indicator of the diffusion of testing 
programs, but no unified accounting system provides access to such data. 
Federal guidelines have been established for all laboratories that test federal 
employees (23), and NIDA conducts a program to certify laboratory quality. 
As of 1990, 73 laboratories had been certified (D. Bush 1991, personal 
communication). Also, the College of American Pathologists (CAP) in- 
stituted a private-sector, voluntary accreditation program in 1987. The num- 
ber of accredited laboratories now stands at 82, up from 36 in 1988, with eight 
more under review (G. Hopewell 1991, personal communication). The extent 
of overlap between the NIDA and CAP programs is unclear, and there is no 
way to know how many nonaccredited laboratories have been attracted into 
the drug testing business as the market has expanded. 

Surveys are another way to assess the extent of drug testing in the work- 
place (3, 7, 22, 25, 29, 42). Most have been highly selective in their sampling 
frames, insufficiently alert to bias associated with low response rates, and 
elliptical in their reports of methodological detail. An exception is the 1988 
Bureau of Labor Statistics (BLS) survey (7), which has been recently updated 
(29). This is the most authoritative source of data on the extent of drug testing 
programs in private employment.

Of “establishments” (contiguous worksites) surveyed in the BLS study
(1988 data; published in 1989), 3.2% had a drug testing program, and the
prevalence increased with employment size: Among establishments with 1000
or more employees, 43% had drug testing programs, but only 2.7% of those
with fewer than 50 workers at the reporting site had programs. This has
important implications for the reach of drug testing programs, as large
establishments (with more than 1000 employees) account for almost 16% of
the workforce, but more than 90% of the nation’s establishments have fewer
than 50 workers, and these small establishments account for more than one
third of all American workers (7).

The BLS survey underscored that testing is most commonly conducted on 
job applicants. Of establishments with programs, 85.2% (or 123,881) tested 
job applicants (usually applicants for all jobs), whereas 63.5% (92,000) tested 
current employees. Two thirds of companies that tested current employees
did so only “for cause,” not on a regular or random basis. Altogether, the
establishments with programs tested under 1 million employees, or 1% of the
private workforce, in the year before the survey.

The 1990 BLS update involved a random sample of close to 800 respondents
to the 1988 survey; 749 were still in business (29). Overall, the increase in
drug testing, from 3.2% to 4.4%, was not statistically significant. In larger establishments,
which employ 250 or more workers, the increase in rates of drug screening 
(from 31.9% to 45.9%) was substantial and statistically significant (29). In 
addition, almost one third of programs reported in 1988 had been discontin- 
ued by 1990, more often in small and medium establishments than in those 
with 250 or more workers. 

Some published accounts have estimated much higher rates of testing than 
those found by the BLS (for example, see Ref. 30). Operational definitions 
are often unspecified, so that the reader cannot tell, for example, whether 
drug tests on military and government employees are included in estimates, 
the types of testing they include, whether they refer to numbers of employees 
eligible or to those actually tested, and whether the unit of analysis is numbers 
of tests or numbers of employees tested. Considering all of this, it seems safe 
to conclude that as a proportion of the entire American labor force, the 
number of workers who have been asked to submit a urine sample for a 
worksite drug test is small. The actual number is unknown, as is the rate of 
increase. 


HOW EFFECTIVE ARE DRUG SCREENING PROGRAMS? 


The effort expended to date on assessing the effectiveness of drug testing 
programs has centered on the accuracy of systems to collect and analyze urine 
samples and to report and interpret subsequent results. A more complete 
assessment of the effectiveness of drug testing as policy would take account of
probabilities all the way down a longer (and messier) decision chain: the
likelihood of false-positives and false-negatives in various laboratories; the 
chance of errors along the chain-of-custody; the danger of misinterpreting 
(technically correct) laboratory test results; the even more complex questions 
about the impact of testing programs on subsequent drug abuse; the correla- 
tion between a positive drug test and future fitness to perform on the job; and, 
finally, larger unanswered questions about whether drug testing programs 
actually identify the drug users most likely to experience serious problems on 
the job. 


The Accuracy of Laboratory Drug Tests 


The National Institute on Drug Abuse’s effort to develop a hierarchical system 
of quality controls has been successful. The full system includes an initial, 
inexpensive screen, which uses chain-of-custody control procedures, fol- 
lowed by more costly confirmatory testing of all positives and a medical 
review of the few remaining confirmed-positive cases. When the system is 
implemented correctly, the general consensus now seems to be that sensitivity 
and specificity are very high. Procedural shortcomings and human error can 
always reduce accuracy, and private-sector testing programs not conducted 
under federal regulation still have enough discretion to omit crucial elements 
in NIDA’s ideal system of checks and balances. 

Without confirmatory testing and medical review, the predictive validity of 
drug screening is unacceptably low. Positive predictive validity refers to the 
percentage of those individuals who tested positive who truly had traces of 
drugs in their urine. This is an especially important consideration in the 
workplace, as a false-positive error can trigger serious harm, including dam- 
age to reputation and denial or loss of a job. Predictive validity is affected by 
the prevalence of the condition in a particular population. The prevalence of 
drug use in samples of workers is low, so caution is needed (13). 

Because predictive validity depends on specificity and prevalence (27, 73), 
the most commonly used drugs are most likely to be identified in drug testing 
programs. Therefore, tests for these drugs should have relatively high accura- 
cy. Unfortunately, the predictive validity for marijuana is approximately 
38%, because the sensitivity and specificity of its tests are not high. With a 
3% prevalence of cocaine, the predictive value is approximately 35%. Even 
with a sensitivity and specificity of 95% for cocaine testing (an accuracy rate 
not always achieved by all laboratories), in a population with a prevalence 
rate of 3%, the positive predictive value would be less than 50%: Of four 
positive samples, two or more would be false-positives, if proper con- 
firmatory steps were not taken. 
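
The arithmetic behind the last of these figures follows from Bayes' theorem. As a minimal sketch, assuming a single unconfirmed screen and using only the sensitivity, specificity, and prevalence quoted above (the function name is ours, chosen for illustration), the positive predictive value can be computed as follows:

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive screens that are true positives (Bayes' theorem)."""
    # True positives: users who are correctly flagged by the screen.
    true_pos = sensitivity * prevalence
    # False positives: non-users who are incorrectly flagged by the screen.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures from the text: 95% sensitivity and specificity, 3% prevalence.
ppv = positive_predictive_value(0.95, 0.95, 0.03)
print(f"PPV = {ppv:.1%}")  # PPV = 37.0%
```

Under these assumptions, roughly 37% of positive screens are true positives, which is consistent with the statement that the positive predictive value would be less than 50%.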

False-positive results usually reflect cross-reactivity or human error, or 
both (6, 53). Cross-reactivity occurs when chemicals other than the ones of
interest produce the same reaction in a drug test (e.g. the analgesic ibuprofen
and some nonsteroidal anti-inflammatory agents may mimic illicit drugs) (6).
Confirmatory tests, which use gas chromatography/mass spectroscopy (GC/ 
MS) techniques, can rule out cross-reactions with near-perfect accuracy (38). 
All federally regulated drug testing programs are required to use back-up 
GC/MS, but how widely these expensive tests are run in private sector 
programs, especially at preemployment, is not known. The same is true of the 
chain-of-custody and medical review procedures built into federal programs. 
The Navy, according to one estimate, allocates a full 20% of the total costs of 
its testing program to quality controls as a hedge against the inevitable human 
errors (38). To what extent employers in the private sector ensure test 
accuracy is unclear. 

Some states have passed legislation to anticipate potential problems that 
inaccurate testing can create. According to a survey conducted by the US 
General Accounting Office in 1988 (23), 11 states had specific statutes and 
regulations to govern laboratories that do employment drug testing. These 
statutes and regulations vary considerably. Ten require confirmation of posi- 
tive tests, and seven specifically require GC/MS. Eight require chain-of- 
custody procedures on all urine specimens to reduce the likelihood of human 
error. 

Federal guidelines have been established for all laboratories that do drug 
testing of federal employees (23), and NIDA conducts its laboratory certifica- 
tion program. However, many small laboratories scattered throughout the 
United States elude the scrutiny of NIDA and other certifying agencies, and 
some employers apparently do their own drug screening on-site. One estimate
placed sales of drug screening reagents to nonlaboratory customers
at more than $11 million in 1987 (4), or an estimated 15% of
the drug screening market; the accuracy of such on-site screening is unlikely
to be high. Two pieces of federal legislation have recently been introduced in
Congress, in the hope of setting minimum federal standards for all drug testing
programs (D. Crouch 1991, personal communication).


Likelihood of Detection 


Even when tests are technically accurate, it is generally felt that drug users 
can find ways to subvert the tightest system. If the testing is preannounced, 
casual users can avoid drugs in anticipation of a job interview or a testing 
program at work (26). Random, unannounced drug testing adds an element of 
surprise, but is generally conducted so infrequently that the odds of identify- 
ing any one drug user remain low (26). Testing for cause may be the only 
circumstance in which the likelihood of discovery is high (26), but the 
discovery occurs after something has already gone wrong. 

Reports of adulterated urine specimens make good copy; whether they are
exaggerated is hard to know. Drug screening is said to select for the less
experienced drug users (1, 63). Powdered drug-free urine is available for sale 
(1, 63), as are guidebooks with practical hints on how to outmaneuver an 
employer’s drug detection screen (14, 37). 


Interpretation of Test Results 


A laboratory report is marked positive when the urine specimen has been
found to contain an amount of drug at or above a certain threshold concentration,
or cutoff level (14), usually set higher than the detection limit of the drug test 
to avoid false-positives (14). The Department of Health and Human Services 
has standard cutoff levels for employers governed by the mandatory guide- 
lines. On their own, employers often establish different cutoffs: In a six-state 
survey of union workplaces in the mid-South, only 24% of private companies 
that do drug testing used the federal cutoff guidelines (40). 

The most important conceptual distinction, however, is between a positive 
test and evidence of functional impairment. A positive result indicates expo- 
sure, but not a pharmacological effect, and many of the metabolites that are 
traceable in urine are still detectable long after psychoactive effects have 
abated (40). Marijuana metabolites, for example, may still be detectable in 
the urine of frequent users for as many as 21 days after the last use; the dwell 
time for PCP is roughly eight days and for cocaine, two to three days.
Further, there is no definitive evidence that illicit drugs impair job perfor- 
mance more than do other exigencies or distractions, such as lack of sleep, 
chronic conditions, or emotional distress. 

Many laboratory experiments and simulations have shown that alcohol and 
other drugs do impair coordination and performance, but how these findings 
translate to work situations is less well understood (43). It is generally 
acknowledged that alcohol, marijuana, and certain other drugs can seriously 
impair judgment and reaction time and should be avoided while doing any- 
thing (on or off the job) that cannot be done safely without concentration or 
coordination. 

However, two post-mortem studies of industrial fatalities found no con- 
vincing evidence of excess drug involvement, although the studies were 
methodologically weak. One examined case files of a Florida county medical 
examiner (15). In 147 instances of fatal industrial injuries, 50% were tested 
on autopsy for traces of drugs. Of those, only 15% had a positive drug test, 
which suggested possible drug impairment. The other autopsy study (35) 
found that only 7% of 172 workers killed on the job had detectable levels of 
drugs that might have impaired their physiologic functioning. In both studies, 
the offending substances were more often alcohol and prescription drugs, 
rather than the illicit drugs that most worksite screening programs emphasize. 

In studies of performance per se, the evidence is again equivocal. In one
study, simulated performance of complex tasks by airline pilots was affected
by smoking marijuana a full 24 hours before the experiment (76); but, in 
another study, women smoked as many as a dozen marijuana cigarettes every 
day for three weeks, with no measurable effect on their output of work (41). 


Does Drug Testing Affect Drug Use or Related Problems at Work?


When the evaluative questions shift from the mechanics of testing to the 
impact of overall programs, the literature consists primarily of descriptive 
reports from the field (12, 48, 52, 64). Often, these reports include rates of 
positive tests over time, but rarely have adequate historical or comparative 
controls to inspire confidence that observed changes might have been partly 
produced by a drug testing program. 

A report from the Southern Pacific Railroad Company went a step further 
and examined outcome data (64). In 1984, the company instituted preemploy- 
ment drug screening, regular testing at the time of a periodic physical 
examination, and “for cause” testing of urine samples for drugs and alcohol. 
Between 1983 and 1988, the number of train accidents per million miles 
traveled declined from 22.2 to approximately 2.2, and total accidents dropped 
from 2234 to 322. The percentage of employees who tested positive for drugs 
or alcohol in these programs declined substantially, from approximately 23% 
in 1984 to around 5% in 1988. Marijuana was the most commonly detected 
drug over the five years; in 1984, 53.8% of positive results were marijuana. 
Positive alcohol tests jumped from 12% of all positive results in 1984 to 24% 
by the first half of 1988. What, if any, role the drug testing program played in 
the changes observed (and the meaning of the alcohol finding) are a matter of 
conjecture without a comparison group. 


Does Drug Testing Predict Subsequent Job Performance? 


To believe that preemployment screening programs effectively select out 
potential employees who would otherwise be impaired by drugs, we need 
evidence that a positive drug test is a good predictor of future job perfor- 
mance. A total of six studies (three peer-reviewed) have obliquely addressed 
this question. Again, the overall results have been mixed. 

Blank & Fenton (5) conducted an unmatched comparison study of 500 male 
Navy recruits who tested positive for marijuana and 500 who tested negative. 
The study compared demographic characteristics and attrition patterns in the 
two groups 2.5 years after intake and found no significant differences in age, 
marital status, and home of origin. Significant differences were found in 
education level, score on a Navy qualification test, and race; the marijuana 
users had lower education and qualification scores and they were also more 
likely to be nonwhite. In terms of retention, 81% of the negative-test group,
but only 57% of positive-test recruits, were still in the Navy after 2.5 years. 
To what extent the different retention rates were a function of the history of 
marijuana use, other preexisting differences in the two groups, subsequent 
surveillance of recruits who tested positive, negative labeling at the time of 
enrollment, and/or other combinations of factors is impossible to unravel. 

Parish (54) conducted a blind prospective study of 180 hospital employees 
to assess how well a preemployment drug test result correlated with perfor- 
mance after 12 months on the job. All employees hired over a six-month 
period were screened for ten substances; all positive tests were confirmed, but 
none were used in hiring decisions. Potential employees were aware that their 
urine would be tested, and there was no observation of specimen collection. 
Researchers who were blind to drug test results extracted from personnel files 
information on job evaluations, disciplinary actions, promotions, com- 
mendations, terminations, and absenteeism. 

The analysis compared those employees who tested positive and negative. 
No differences were found in job retention, supervisor evaluations, and 
reasons for termination. Eleven employees who were drug free at intake were 
fired from their jobs, whereas none were fired from the test-positive group, a 
challenging finding. The numbers were small (12%, or 22 employees tested 
positive), so the statistical power was low, and a Type II error (the inability to 
detect a true difference) could have occurred. 
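
To give a sense of how little a sample of this size can show, the following sketch asks how often chance alone would leave the test-positive group with no firings at all. It assumes, purely for illustration, that the 11 firings were the only ones among the 180 employees and that firing was unrelated to test status; the function name is hypothetical, and the calculation is ours rather than part of the study.

```python
from math import comb


def prob_no_events_in_subgroup(total, events, subgroup):
    """Hypergeometric chance that a subgroup contains none of the events.

    Assumes the events fall on members of the population at random.
    """
    return comb(total - events, subgroup) / comb(total, subgroup)


# Figures from the text: 180 employees, 22 test-positive, 11 fired overall
# (all 11 came from the test-negative group).
p_zero = prob_no_events_in_subgroup(total=180, events=11, subgroup=22)
print(f"{p_zero:.2f}")  # 0.23
```

Under these assumptions, a test-positive group with no firings would arise by chance in roughly one of every four or five such samples, so the finding cannot be distinguished from random variation.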

Zwerling et al (77) conducted another prospective blind study of preem- 
ployment drug testing among postal workers, who provided a larger sample 
(N=2537). EMIT was used as the screen, followed by confirmation by 
GC/MS. A quality control test of spiked samples indicated 90-100% sensitiv- 
ity and 100% specificity of the laboratory tests. Employees, hiring officials, 
medical personnel, and management were all blinded to the results of the 
urine tests. No significant association was found between the decision 
whether to hire applicants and the (undisclosed) results of their urine tests. 
The prevalence of drug positives was 7.8% for marijuana, 2.2% for cocaine, 
and 2.2% for other or multiple drugs. 

At follow-up more than one year later, the two groups were compared after 
controlling for age, sex, smoking and exercise status, race, and job classifica- 
tion. Employees who had tested positive for marijuana at preemployment 
were significantly more likely to have been terminated from their jobs [relative
risk (RR), 2.07]. The test-positive group also had a higher rate of absence
(RR, 1.56) and earlier times to first injury (RR, 1.85), to first
accident (RR, 1.55), and to disciplinary action (RR, 1.55). Cocaine-positive
workers had significantly greater risk of earlier first injury (RR, 1.85) and a
higher absence rate (RR, 2.37), but cocaine did not predict termination, time to first
accident, or time to first disciplinary action. 

Normand et al (50) conducted the most thorough evaluation of drug testing
to date, a follow-up of 4396 new postal service employees who had provided
urine samples at preemployment interviews. The samples were tested for 
eight classes of drugs (amphetamines, barbiturates, benzodiazepines, cannabinoids,
cocaine, methadone, opiates, and PCP), and all positive EMIT
readings were confirmed by GC/MS. The results were withheld from every- 
one except the research staff. 

In contrast to the Zwerling study, this one found that job applicants who 
tested positive were less often hired (81% of test-negative applicants were 
hired, compared with 73% of test-positive, even though the information on 
their testing status was unavailable for the hiring decision). Of applicants who 
were hired, 5.7% tested positive for marijuana, 2.2% tested positive for 
cocaine, and 0.9% tested positive for all other drugs combined. 

Workers who did and did not test positive at preemployment were com- 
pared on absenteeism, voluntary and involuntary separation, and injury and 
accident rates after just over one year of employment. Postal workers who 
tested positive for both cocaine and marijuana were placed in the cocaine- 
positive group for purposes of analysis. 

Compared with the group of workers who tested negative for any drug, the 
marijuana-positive group was 1.5 times more likely and the cocaine-positive 
group 4.29 times more likely to exhibit heavy absenteeism over the 1.3 years 
of follow-up. These were odds ratios, which were statistically significant at 
the 0.01 level. The odds of voluntary separation were the same irrespective of 
testing status, but involuntary separation was 1.55 times as likely (and 
statistically significant at the 0.01 level) in the group testing positive for any 
drug and 2.4 times as likely in the cocaine-positive group (again, statistically 
significant). 
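
Because these figures are odds ratios, they cannot be read directly as "times as likely" when the outcome is common. The sketch below shows the difference between an odds ratio and a risk ratio. As an arithmetic illustration only, it borrows the hiring percentages reported earlier for this study (81% versus 73%); the study did not report those percentages as an odds ratio, and the function names are ours.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Ratio of the odds in group A to the odds in group B."""
    return odds(p_a) / odds(p_b)


def risk_ratio(p_a, p_b):
    """Ratio of the probability in group A to that in group B."""
    return p_a / p_b


# Hiring rates from the text: 81% of test-negative and 73% of
# test-positive applicants were hired.
print(f"odds ratio = {odds_ratio(0.81, 0.73):.2f}")  # odds ratio = 1.58
print(f"risk ratio = {risk_ratio(0.81, 0.73):.2f}")  # risk ratio = 1.11
```

For a common outcome such as being hired, the odds ratio (about 1.58) is well above the risk ratio (about 1.11); the two converge only when the outcome is rare.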

A logistic regression analysis controlling for age, sex, and job category
revealed that a positive test result significantly predicted absenteeism. After
controlling for job category, a positive test result also predicted involuntary separation.
The investigators speculated that the associations probably underestimated the 
true relationships between drug use and job performance indicators, owing to 
misclassification, measurement error, construct invalidity, and other factors. 

After assessing the predictive capability of a preemployment testing pro- 
gram, Normand et al (50) also conducted an analysis of costs versus benefits. 
This analysis produced an estimated cost savings for the drug testing program
(through reduced absenteeism and turnover costs) of $52,750,000 for one (annual)
cohort of new employees, a figure the investigators considered an un- 
derestimate of all the potential savings. The authors were careful to note that 
their study did not purport to establish a causal link from drug use to 
absenteeism and the other performance indicators. They argued that a positive 
drug test may stand as a proxy for a whole complex of factors, including 
personal characteristics and lifestyles, that combine to produce a worker who
is significantly less likely to perform well, whatever the proximal cause.
Work by Kandel & Yamaguchi (34) would tend to support this assumption as 
it relates to job separation in young adults. 

Among the indicators of performance used as outcome measures in these 
studies, absenteeism may be more objective and reliable than job termination 
or supervisor assessment because the latter measures include an element of 
subjectivity. Absenteeism, however, is not often documented as well and 
operationalized as carefully as it needs to be to support rigorous evaluation 
research (67). As an outcome, it does tend to yield greater statistical power 
than some alternatives (e.g. injuries or job terminations) that occur less 
frequently. Only two of the evaluations of drug testing in the workplace (50, 
77) had sufficient statistical power and they both found significantly higher 
absenteeism rates among the groups of employees who tested positive for 
drugs in the preemployment screen. 


Do the Benefits of a Drug Testing Program Outweigh the Costs?


Other than in the postal worker studies, information on the costs of drug 
testing programs is thin. The Navy is believed to spend approximately 
$90-$100 per specimen on drug testing (including collection, transportation,
and analysis) (10), which, at 2 million tests a year, would sum to approximately
$190 million, just in test handling costs. Private companies pay
$15-$30 per specimen for EMIT or radioimmunoassay, depending on volume
and location (40). A GC/MS confirmation costs $35-$100 per specimen (40).
In addition, there are initial start-up costs, costs of staff time, legal fees, and
time off from work, as well as difficult-to-quantify indirect costs. Whether 
these costs are justified depends on the benefits achieved. 

Too little valid evaluation research has been conducted to date to support 
cost-benefit analyses of drug screening programs. Two studies are widely 
quoted, one conducted at the Utah Power and Light Company (16), the other 
at Georgia Power and Light (unpublished). In both cases, selection bias and 
insufficient attention to competing explanations for presumed program effects 
cast doubt on the positive returns on investment that both studies purported to show.

None of the cost-benefit analyses conducted to date compares drug testing 
with alternative methods for deterring employees from using drugs, or 
alternative methods for screening out workers who may be unfit for duty on 
any given day. As just one example, a California-based company, Perfor- 
mance Factors, manufactures and markets a computerized job performance 
testing machine called “Factor 1000” (55). When employees in safety- 
sensitive jobs report to work daily, they are required to perform tasks on a 
computer that tracks their hand-eye coordination and reaction time (but not
critical activity or judgment) against their own baseline, which is constantly
updated in the computer file as their skill improves. The company asserts that 
Factor 1000 can detect impairment, irrespective of cause (prescription or 
illicit drug use, alcohol, severe stress, fatigue, or illness). Customers include 
the National Highway Traffic Safety Administration and several private 
firms. A serious study of Factor 1000, and/or similar devices, might compare 
their performance with that of a drug testing program in a randomized 
controlled trial. 

For policy analysis, an adequate cost-benefit framework should encompass 
a variety of alternative expenditures designed to address a given problem. No 
studies have compared the relative costs and benefits of worksite drug screen- 
ing with more investment (for example) in primary prevention, health promo- 
tion, or employee assistance in the workplace or in fuller coverage for 
treatment of drug-abusing employees. No analyses have widened the 
framework to ask (as one of many examples) whether an equivalent ex- 
penditure in treatment for pregnant cocaine addicts would yield a greater 
social payoff, all things considered. These are the kinds of questions that need 
thoughtful consideration. 


QUESTIONS FOR THE FUTURE 


Technological advances and administrative innovations have made worksite 
drug screening almost perfectly accurate, when structured according to pro- 
tocol. The protocol virtually removes from consideration the danger of falsely 
accusing a worker of having used an illicit drug. More surveillance is needed 
to ensure that the protocol is being followed wherever employers are testing 
for drugs, but most of the technical questions have been addressed and can be 
resolved. 

What remain now are questions about what it means to learn that an 
employee has a trace of an illicit drug in his or her urine, what the social costs 
and benefits are of expending the resources to find that out, and what 
subsequent actions are justified if the goal is to enhance the health and 
productive capacity of the American labor force. We know almost nothing 
about what happens to applicants denied employment because of a positive 
drug test, or even whether they are told why they were not selected for the 
job. We have no information on how many (if any) employees have lost their 
jobs because of drug testing and what has been their fate. We have no 
information on how workers have been affected by drug testing programs. If 
some have been referred for treatment, we have no knowledge of how they 
have fared and whether the drug test helped or hindered their recovery. If 
some were not referred to treatment, we do not know if they are still 
employed.

We have only unsubstantiated assertions that worksite drug testing pro-
grams have changed beliefs, attitudes, and values concerning drugs and work. 
We have no knowledge of their impact on labor relations and other aspects of 
the employment relationship. No one has gauged whether they have had an 
impact on other health, safety, and mental health programs in the workplace; 
whether they have further complicated delicate relationships between occupa- 
tional physicians and employees (69); and whether they have strengthened or 
shackled EAPs. If drug screening programs have had a beneficial or adverse 
effect on employees’ motivation, trust, or morale, we have no way to know, 
although we have reason to care, in light of growing concern about American 
competitiveness in world markets. Policymakers and lawmakers have no way 
to know how often, to what extent, and in which ways drug screening 
programs may be violating rights of privacy and due process of law. Nor do 
they know if these programs support or undermine general feelings of autono- 
my, community, good faith, and decency in places of work. Legislative and 
regulatory restrictions may or may not have their desired effect. Employees in 
large firms are often protected by multiple layers of law and regulation, 
whereas those in smaller and nonunion shops and plants may have little such 
protection. 

The most important gap in knowledge about drug testing programs pertains 
to their effectiveness. We have little cogent evidence to support the supposi- 
tion that these programs reduce drug use in the workplace, improve perfor- 
mance and productivity, or produce other positive results. In the absence of 
such evidence, civil libertarians and labor leaders are asking whether these 
programs are anything more than symbolic politics, whether they are an 
implicit statement by the federal government that responsibility for health and 
productivity resides with workers themselves, and whether they deflect atten- 
tion from structural problems in the macroeconomy by implying that Amer- 
ican industry’s competitive problems are the fault of workers who use drugs. 

A rule of thumb in the clinical management of substance abuse could 
profitably be applied to the evaluation of drug testing at work: The least 
intensive, intrusive, and coercive approaches should be given a fair trial first, 
before ratcheting up to treatments with greater potential for harm. More 
invasive interventions should carry a heavier burden of proof that they hold 
genuine promise of doing more good than harm. Such tests have yet to be 
applied objectively to drug testing at work.


INTRODUCTION


Immunization programs are recognized as one of the most cost-effective 
interventions of public health. However, even as late as the mid-1970s, it was 
estimated that less than 5% of children in developing countries were adequate- 
ly immunized against diphtheria, tetanus, pertussis, measles, poliomyelitis, 
and tuberculosis. 

In response to the tragic numbers of deaths due to these vaccine- 
preventable diseases, the World Health Assembly (WHA) of the World 
Health Organization (WHO) initiated the Expanded Programme on Im- 
munization (EPI) in 1974. The EPI forms the basis of a global effort to reduce 
morbidity and mortality from these six diseases by providing immunization 
services for all children and women of the world. 

In this article, we document the significant progress, the lessons learned, 
and the challenges facing global immunization efforts in the 1990s. We also 
discuss many of the recommendations of the EPI Global Advisory Group 
(13). 


OVERVIEW OF PROGRESS 


For the first time, reported immunization coverage is surpassing the 80% 
mark for a third dose of polio or DPT vaccines for children in their first year 
of life (Figure 1). This represents a milestone towards universal childhood 
immunization. However, the percentage of pregnant women who receive 
tetanus toxoid immunization to protect their newborns from neonatal tetanus 
is much lower. In developing countries, only 38% of pregnant women receive 
the two-dose primary series or a booster dose. It is also important to note that 
global statistics mask disparities among regions (Figure 2), countries, 
provinces/states, and districts (6). These immunization coverage levels, which 
reflect the varied development of the primary health care infrastructure, are 
one of the measures of the degree of equity and social justice that communities 
have achieved. 

Figure 1 Expanded Programme on Immunization, immunization coverage 1977 to 1990. 

The progress in global immunization is directly attributable to the efforts of 
national governments, WHO, the United Nations Children’s Fund (UNICEF) 
and other UN agencies, bilateral development agencies, and nongovernmental 
organizations. The development of the capacity to achieve these levels of 
coverage of infants represents a major public health triumph for the 1980s. 

At the present levels of immunization coverage, an estimated 3.2 million 
deaths due to measles, neonatal tetanus, and pertussis are prevented annually, 
as are some 450,000 cases of paralytic poliomyelitis (Figure 3). The urgency 
to raise immunization coverage levels and focus on disease control is 
underlined by the occurrence of an estimated 1.8 million deaths 
each year due to these diseases and some 120,000 cases of paralytic 
poliomyelitis, all of which are preventable through immunization (6). 

Figure 2 Immunization coverage of children less than 12 months of age by WHO region, April 1991. 


LESSONS LEARNED 


Over the last 15 years, the global immunization effort has demonstrated that a 
global coalition, which shares common goals, can create an unprecedented 
degree of cooperation among a wide spectrum of national governments and 
international, national, and local organizations. 

The development of the EPI over these years has taught specific lessons 
that will continue to guide immunization programs in the 1990s: 


1. Goals endorsed by the WHA help galvanize the international community 
to action; 

2. Policies and strategies recommended by the EPI Global Advisory Group 
provide a common direction for immunization efforts (13); 

3. Personal involvement of heads of state and the political, religious, and 
social leadership at all levels generates political will, creates demand for 
immunization services, and mobilizes communities to meet that demand; 

4. Nearly all mothers and children of the world can be reached with 
immunization services, and significant reductions in morbidity, disability, 
and mortality can be achieved; 

5. Research and development in logistics, cold chain, injection equipment, 
new and improved vaccines, delivery strategies, immunization schedules, 
and monitoring and evaluation methodologies (11) provide a technical 
basis for advancing program policies and strategies; 

6. Emphasis on training in technical skills, planning, and management 
develops the needed human resources at senior, middle, and peripheral levels; 
and 

7. Some diseases can be eradicated from the face of the earth; such an effort 
ultimately saves money by ending the need to immunize against 
the disease or treat its victims. 


The progress and lessons learned provide optimism that the new challenges 
set by the WHA for global immunization programs in the 1990s will be met. 


GOALS 


In May 1989, the Forty-Second World Health Assembly set the 1990s agenda 
for the EPI in resolution WHA42.32 (16). Six major challenges to be ad- 
dressed during the decade were cited: 


1. Achieving and sustaining in all countries full immunization coverage with 
all the vaccines used by the EPI; 

2. Controlling the target diseases, including reduction of measles by 90% 
compared with pre-immunization levels by 1995, elimination of neonatal 
tetanus by 1995, and global eradication of poliomyelitis by the year 2000; 

3. Improving disease surveillance to provide accurate assessment of the 
progress of the program; 

4. Introducing within routine national immunization services new or 
improved vaccines as these become available for public health use; 

5. Promoting other primary health care practices that are appropriate for the 
program’s delivery system and target populations; and 

6. Conducting research and development in support of the above. 


GLOBAL IMMUNIZATION 


These challenges, among others, were dramatically reinforced in the Dec- 
laration on the Survival, Protection, and Development of Children, which was 
enunciated at the World Summit for Children held at the UN in September 
1990 (19). The plan of action associated with the Declaration brings the goals 
for children and development to the highest levels of political visibility. The 
international community must now move forward rapidly to use the momen- 
tum of this summit to translate the Declaration into the actions necessary to 
achieve these goals. 


PLANNING 


Development of immunization and disease control plans of action at global, 
regional, national, state/province, and local levels, with periodic review and 
revision, is necessary to set priorities among activities to meet the above 
challenges. Plans for achieving specific immunization coverage and disease 
reduction, elimination, and eradication targets should be part of an overall 
immunization plan of action, which, in turn, should be part of a primary 
health care plan. 

The establishment and effective functioning of interagency coordinating 
committees, which improve the coordination of donors in support of regional 
and national immunization programs, are useful in planning and integrating 
activities aimed at sustaining immunization programs. In many countries, 
most notably in the region of the Americas, interagency coordinating com- 
mittees have elaborated detailed financial plans that outline the commitments 
of the national governments and their donor partners over a medium-term 
period. These committees review program performance at country level 
through periodic meetings and make adjustments to plans and their funding. 
This coordination benefits both receiving countries and donor agencies by 
promoting effective use of available resources and by providing individual 
donors with the accountability and visibility needed for continuing support. 
Interagency coordinating committees, although perhaps initially formed for 
immunization programs, should ultimately have a broader mandate for 
coordination of primary health care activities in general. 

The planning process provides the opportunity for regions, governments, 
and donors to balance global targets with regional and national priorities. 
Each region and country sets its own priorities based on the magnitude of the 
disease burden, available resources, expected outcome, and fit within the 
overall health care goals. The WHA resolutions represent the combined will 
of all the WHO member states. The WHA has stated that EPI goals and 
targets “should be pursued in ways which strengthen the development of the 
Expanded Programme on Immunization as a whole, fostering its contribution, 
in turn, to the development of the health infrastructure and of primary health 
care” (16, 17). 


ACHIEVING AND SUSTAINING FULL IMMUNIZATION 
COVERAGE 


A major goal of the EPI continues to be raising and sustaining immunization 
coverage. High levels of immunization coverage provide the foundation on 
which specific efforts at disease control can be mounted and ensure that 
disease control, once achieved, can be maintained. The WHA has urged all 
countries to continue their vigorous pursuit of providing immunization ser- 
vices for all children and women of the world. Immunization coverage levels 
of 90% for all vaccines, including tetanus toxoid in women of childbearing 
age, can be achieved in all countries by the year 2000. This will require 
increased emphasis on directing program resources to achieve and sustain 
high immunization coverage levels in all districts/municipalities and, ul- 
timately, in all communities. The separate analysis of immunization coverage 
data by district or community helps identify low coverage areas. 
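
As a minimal sketch of such a district-level analysis, the following Python fragment computes coverage for each district and lists those that fall below the 90% level cited above. The district names and counts are hypothetical and are not drawn from EPI data, and the function names are illustrative only. 

```python
# Hypothetical district-level figures; not data from the article.
TARGET_COVERAGE = 0.90  # the 90% coverage level cited in the text

districts = {
    # district name: (infants given a third dose, infants under 12 months)
    "District A": (4600, 5000),
    "District B": (2100, 3000),
    "District C": (1950, 2100),
}


def coverage_by_district(data):
    """Return {district: coverage fraction}, skipping empty denominators."""
    return {
        name: immunized / eligible
        for name, (immunized, eligible) in data.items()
        if eligible > 0
    }


def low_coverage_districts(data, threshold=TARGET_COVERAGE):
    """List (district, coverage) pairs below the threshold, lowest first."""
    coverage = coverage_by_district(data)
    below = [(name, frac) for name, frac in coverage.items() if frac < threshold]
    return sorted(below, key=lambda item: item[1])


def aggregate_coverage(data):
    """Pooled coverage across all districts."""
    total_immunized = sum(immunized for immunized, _ in data.values())
    total_eligible = sum(eligible for _, eligible in data.values())
    return total_immunized / total_eligible


if __name__ == "__main__":
    print(f"Aggregate coverage: {aggregate_coverage(districts):.1%}")
    for name, frac in low_coverage_districts(districts):
        print(f"{name}: {frac:.1%} (below target)")
```

With these invented figures, pooled coverage is about 86%, yet one district stands at 70%, which illustrates how an aggregate figure can mask a low coverage area. 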

Immunization schedules need to be simple, effective, and epidemiological- 
ly appropriate. The EPI Global Advisory Group periodically reviews recom- 
mendations for immunization schedules. The current schedule endorsed by 
the EPI Global Advisory Group is designed to provide protection at the 
earliest possible age (Table 1, see also Ref. 12). 

Priority activities to raise immunization coverage include: 

1. Improving the management of health services: Decentralizing 
responsibilities and providing training and supportive supervision to the 
health workers who provide immunizations. 

2. Making primary health care services more accessible: Increasing the 
frequency and range of outreach activities to extend immunization services to 
populations currently without access. 

3. Informing and motivating the public: Creating demand for immuniza- 
tion services by specifically recognizing that fathers, as well as mothers, play 
important health roles. 

4. Immunizing at every opportunity: Providing immunization services as 
frequently as feasible at all health facilities attended by women and children, 
reviewing the immunization needs of both mother and child at the time of 
immunization of the child, and avoiding false contraindications so that 
immunizations are not withheld unnecessarily. 

5. Reducing drop-out rates: Providing courteous services at times and 
places convenient for the users, informing parents of the importance of 
returning to complete the immunization schedule, and identifying women and 
children who are eligible for immunization and actively following up those 
who default. 

6. Using special immunization activities: Including one or more of the 
following activities in high risk areas where routine coverage remains 
significantly below average or where there is continuing transmission of disease: 
employing mass media to encourage the use of existing services; increasing 
immunization outreach activities; utilizing national, state/province, or local 
immunization days, weeks, or months (18); and performing “mopping-up” 
operations (providing oral polio vaccine to all children of an 
epidemiologically appropriate age group, as well as all other needed EPI vaccines to women 
and infants on a house-to-house basis). 


Table 1 Recommended immunization schedule for providing protection at the earliest possible age 

Age          Vaccine 

Birth        TOPV(a), BCG(b) 
6 weeks      TOPV, DPT(c) 
10 weeks     TOPV, DPT 
14 weeks     TOPV, DPT 
6-9 months   Measles (high titer Edmonston-Zagreb strain of measles vaccine has 
             been recommended at 6 months of age in countries in which measles 
             before the age of 9 months is a significant cause of death) 
             Yellow fever (in endemic countries) 

(a) TOPV = trivalent oral polio vaccine (the dose at birth or first contact is recommended in countries where 
poliomyelitis has not been controlled) 
(b) BCG = vaccine against tuberculosis 
(c) DPT = diphtheria, tetanus, and pertussis vaccine 

Priority actions to ensure sustainability of high levels of immunization 
coverage include: 

1. Coordinating donor support: Establishing interagency coordinating 
committees, thus recognizing that, in many developing countries, donor 
support of immunization activities must continue for the foreseeable future. 

2. Monitoring quality of services: Implementing such quality of service 
indicators as acceptability of immunization services provided to the communi- 
ty, appropriateness of health education messages, adequacy of cold chain, 
sterility of injection equipment, reports of adverse events following im- 
munization, and field evaluation of vaccine efficacy. 

3. Costing, budgeting, and financing: Determining costs and developing 
budgets for immunization programs help governments mobilize internal and 
external resources. Reduction of recurrent and hard currency costs can be 
achieved through reduced vaccine wastage, vehicle whole life contracts, sale 
of solar energy, and financing alternatives that use local currency revolving 
funds for the critical recurrent cost of vaccines. 

4. Transferring technology: Producing vaccine or packaging from bulk 
may be appropriate for some countries. To transfer technology, the National 
Control Authority must be sufficiently developed to certify that the final 
vaccine product meets WHO requirements. 

5. Providing relief in situations of armed conflict: Establishing “days of 
tranquility” and “special relief corridors” for the benefit of children and 
women in situations of armed conflict that endanger sustainability of im- 
munization services. 


IMPROVING DISEASE SURVEILLANCE 


As immunization coverage levels rise, there is an increasing focus on disease 
surveillance as an indicator of program impact. Surveillance as “information 
for action” helps direct immunization activities to areas of greatest need and is 
a prerequisite to achieve the specific disease control targets. 

Surveillance for the EPI target diseases is ideally improved through 
strengthening a national surveillance system that reports only a selected 
number of high priority infectious diseases. A properly functioning system of 
surveillance includes the following: an appropriate mix of routine, sentinel, 
active, and laboratory-based surveillance activities; monthly or weekly re- 
ports submitted in a timely manner from all health units, including reports of 
zero cases; neonatal tetanus reported separately from other forms of tetanus; a 
mechanism for following up late or absent reports; and immediate reporting of 
rare, notifiable diseases of high importance, including poliomyelitis, in areas 
where this disease is close to eradication. 

Actions based on surveillance, including outbreak investigation, outbreak 
control measures, assessment of vaccine efficacy, and review of immuniza- 
tion policies and strategies, become increasingly important with higher levels 
of immunization coverage. Disease surveillance also serves as an indicator of 
program quality by identifying areas of low immunization coverage and 
detecting vaccine failure caused by inadequate vaccine quality, transport, 
storage, or administration. 

Managers responsible for disease surveillance should use indicators, such 
as timeliness and completeness of reporting and promptness in taking 
necessary action, to assess progress in surveillance. Computerized surveillance 
information systems are excellent management tools for disease surveillance 
and for monitoring surveillance indicators and immunization coverage. 
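
The reporting indicators named above can be sketched in a few lines of Python. The definitions used here are assumptions for illustration: completeness is taken as the share of expected reports that were received, and timeliness as the share of expected reports received by the deadline. The health unit names and values are hypothetical. 

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    unit: str                 # reporting health unit (hypothetical names below)
    days_late: Optional[int]  # None = never received; 0 = received on time


def completeness(reports: List[Report]) -> float:
    """Share of expected reports that were received at all."""
    if not reports:
        raise ValueError("at least one expected report is required")
    received = sum(1 for r in reports if r.days_late is not None)
    return received / len(reports)


def timeliness(reports: List[Report], grace_days: int = 0) -> float:
    """Share of expected reports received within the grace period."""
    if not reports:
        raise ValueError("at least one expected report is required")
    on_time = sum(
        1 for r in reports
        if r.days_late is not None and r.days_late <= grace_days
    )
    return on_time / len(reports)


def units_to_follow_up(reports: List[Report], grace_days: int = 0) -> List[str]:
    """Units whose reports are late or absent and need follow-up."""
    return [
        r.unit for r in reports
        if r.days_late is None or r.days_late > grace_days
    ]


# One reporting period for four hypothetical health units.
reports = [
    Report("Unit A", 0),
    Report("Unit B", 3),
    Report("Unit C", None),
    Report("Unit D", 0),
]

if __name__ == "__main__":
    print(f"Completeness: {completeness(reports):.0%}")
    print(f"Timeliness:   {timeliness(reports):.0%}")
    print("Follow up:", units_to_follow_up(reports))
```

A report of zero cases counts as a received report in this sketch, consistent with the requirement that health units report even when no cases occur. 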







CONTROLLING THE TARGET DISEASES 


The measles reduction, neonatal tetanus elimination, and poliomyelitis 
eradication initiatives were formulated to strengthen a sustainable health 
infrastructure that can deliver immunization and other primary health care 
services. Many of the strategies of the disease control initiatives, such as the 
following, are common to all: raising and sustaining high levels of immuniza- 
tion coverage, improving disease surveillance, creating and maintaining pub- 
lic awareness to sustain political and financial commitment, and providing 
information and education to parents and other community members to 
increase immunization coverage and improve detection of cases. 

The following three sections provide information on the objectives, current 
status, and special strategies unique to each of these initiatives. 


Measles Reduction 


The objectives of this reduction initiative are to achieve, by 1995, a reduction 
by 90% in measles cases and 95% in measles deaths compared with pre- 
immunization levels. These morbidity and mortality reduction targets are a 
major step toward global eradication of measles in the longer run. 

Currently, 78% of infants have reportedly received measles vaccine. The 
global immunization program prevents an estimated 84 million measles cases 
and 2 million measles deaths in developing countries each year. This 
represents a 74% decrease in measles cases and a 69% decrease in measles 
deaths compared with estimates of cases and deaths that would occur annually 
in the absence of immunization programs at pre-immunization rates of disease. 
The continuing significant disease burden due to measles is reflected in 
the estimated 29 million cases and 900,000 deaths that occur in developing 
countries each year (6). 
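
The percentages in this paragraph can be reproduced from the counts it gives. The short Python sketch below assumes that the burden expected in the absence of immunization equals the events prevented plus the events that still occur; this reading is an assumption, although it matches the published figures. The function name is illustrative. 

```python
def prevented_fraction(prevented, still_occurring):
    """Share of the expected burden that immunization prevents.

    Assumes expected burden = prevented + still occurring.
    """
    expected = prevented + still_occurring
    return prevented / expected


# Annual estimates for developing countries, as quoted in the text.
measles_cases = prevented_fraction(84_000_000, 29_000_000)
measles_deaths = prevented_fraction(2_000_000, 900_000)

if __name__ == "__main__":
    print(f"Measles cases prevented:  {measles_cases:.0%}")   # 74%
    print(f"Measles deaths prevented: {measles_deaths:.0%}")  # 69%
```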

The 95% reduction in measles deaths will be achieved through such 
strategies as directing program resources to areas of highest mortality rates, 
immunizing at the most vulnerable early ages, supplementing and treating 
with vitamin A in areas of severe vitamin A deficiency, and improving 
treatment and management of complications through acute respiratory infec- 
tion and diarrheal disease control programs. Measles before the age of nine 
months continues to be a major cause of mortality in many developing 
countries. The EPI Global Advisory Group recommends that high titer 
Edmonston-Zagreb measles vaccine be administered, as it becomes available, 
at six months of age or as soon thereafter as possible in these countries. 

Measles outbreaks must be expected even in programs with relatively high 
coverage. A temporary period of low incidence usually follows accelerated 
measles control activities, but outbreaks are still likely to occur because of the 
accumulation of susceptibles (2). Outbreaks should be analyzed to ensure that 
there is high vaccine efficacy and that immunization schedules and delivery 
strategies are epidemiologically appropriate. Such outbreaks may identify 
high risk areas that are suitable for special immunization activities and may 
provide an opportunity to secure additional resources for immunization pro- 
grams from political leaders. 


Neonatal Tetanus Elimination 


The objective of this elimination initiative is to reach a stage at which there 
are no cases of neonatal tetanus in the world by 1995. However, the term 
“elimination” recognizes that it is not feasible to remove the causative tetanus 
organism from the environment. Given the awareness of the importance of 
clean delivery practices and the availability of effective vaccines, the continu- 
ing occurrence of maternal and neonatal tetanus represents a major failure of 
public health practice. 

Currently, only 38% of pregnant women in developing countries have 
received the two-dose primary series or a booster dose of tetanus toxoid to 
protect their newborns from neonatal tetanus. The global immunization pro- 
gram prevents an estimated 700,000 neonatal tetanus deaths in developing 
countries each year. This is an estimated 55% decrease in neonatal tetanus 
deaths compared with estimates of deaths that would occur annually in the 
absence of immunization programs at pre-immunization rates of disease. The 
percentage of prevented deaths is greater than the global percentage of 
immunization coverage because coverage levels are higher than the global 
average in some larger countries that have the highest pre-immunization 
neonatal tetanus mortality rates. Neonatal tetanus, however, remains a signifi- 
cant cause of neonatal deaths, as an estimated 600,000 deaths occur each year 
(6). The 1995 target date has helped emphasize that neonatal tetanus is a 
major killer of the world’s children. 

The goal of elimination of neonatal tetanus is being pursued in ways that 
foster the development of maternal and child health services. The global plan 
of action for neonatal tetanus elimination emphasizes a twofold strategy (9): 
achieving high levels of immunization coverage in women of childbearing age 
with tetanus toxoid, and raising the proportion of clean deliveries (clean 
hands, clean delivery surface, and clean cutting and care of the umbilical 
cord). The priority for all countries in which neonatal tetanus remains endem- 
ic is to increase tetanus toxoid protection in women of childbearing age 
rapidly, especially in high risk areas. The number of neonatal tetanus deaths 
could be dramatically reduced if all women were screened and appropriately 
immunized when they brought their children for immunization and if all 
antenatal care clinics offered tetanus toxoid to their clients. 


Poliomyelitis Eradication 


In May 1988, the WHA committed WHO to the goal of the global eradication 
of poliomyelitis by the year 2000 (17). The term “eradication” means that the 
final objective of this initiative is to reach a stage in which there is no 
circulation of wild poliovirus. In practical terms, this means that no cases of 
clinical poliomyelitis are associated with wild poliovirus and no wild poliovi- 
rus can be identified through sampling of the environment. 

A reported 84% of children in the world have received a full course of polio 
vaccine before their first birthday. More than 400,000 cases of paralytic 
poliomyelitis are prevented in developing countries each year. This represents 
an estimated 79% decrease in paralytic poliomyelitis cases, compared with 
estimates of cases that would occur annually in the absence of immunization 
programs at pre-immunization rates of disease. However, an estimated 
120,000 paralytic poliomyelitis cases still occur each year (6). 

Much has been learned from the experience of poliomyelitis eradication in 
the Americas, which began a regional eradication initiative in 1985. The 
number of cases in the Americas has been so dramatically reduced that 
transmission of wild poliovirus may be completely interrupted in the region in 
1991 (see Ref. 3). This experience has permitted the rapid development of 
poliomyelitis eradication plans of action at global, regional, and country 
levels. The global plan of action for poliomyelitis eradication emphasizes the 
following strategies in addition to those common to all of the disease control 
initiatives (10): 


1. Surveillance for acute flaccid paralysis and the development of capabilities 
for outbreak control; 

2. Development of a laboratory network of global, regional, and national 
reference laboratories through strengthening laboratory capabilities, 
including training of laboratory personnel, for the isolation and 
characterization of polioviruses, vaccine quality control, and environmental 
surveillance for the presence of wild poliovirus; 

3. Improvement of poliomyelitis rehabilitation services, particularly through 
community-based programs; and 

4. Promotion of research to develop better eradication strategies, including 
improved poliomyelitis vaccines, and reliable, rapid diagnostic methods. 


The EPI Global Advisory Group recommends trivalent oral poliomyelitis 
vaccine (TOPV) as the vaccine of choice for poliomyelitis eradication. 
Routine immunization with TOPV provides individual protection for most 
recipients and, if high coverage is achieved, markedly reduces the incidence 
of acute poliomyelitis. However, in many countries, special immunization 
activities, such as mopping-up operations with the mass administration of 
TOPV in high risk areas over a short space of time, will be required 
to displace wild poliovirus and reliably achieve eradication by the year 
2000. 

Intensified international assistance is required to eradicate poliomyelitis 
from areas in which transmission remains endemic and local resources are 
insufficient. This assistance will increasingly be recognized as beneficial to 
the international community through future savings from ending the 
production, storage, and administration of poliomyelitis vaccines; ending the 
treatment of the disease and its complications; and avoiding any 
adverse reactions to immunization. 


INTRODUCTION OF NEW AND IMPROVED VACCINES 


There is a long list of additional vaccines, either in existence or under 
development, that are suitable for widespread use in developing countries. 
Finding ways to make these vaccines affordable for developing countries is a 
major challenge. The Children’s Vaccine Initiative, enunciated in the Dec- 
laration of New York in September 1990, states: “universal immunization will 
be facilitated by accelerating the application of current science to make new 
and better vaccines, benefiting children in all countries. These include vac- 
cines which: require one or two rather than multiple doses; can be given 
earlier in life; can be combined in novel ways, reducing the number of 
injections or visits required; are more heat stable . . .; are effective against a 
wide variety of diseases . . . and are affordable.” 

To date, the main cost of national immunization programs has been the 
salaries of health staff to give the vaccines, rather than the cost of the vaccines 
themselves. The cost to immunize a child fully is approximately US$5 to 
US$15; the cost of the vaccines is less than US$1. Vaccine costs will increase 
as more expensive new and improved vaccines are introduced. 

Hepatitis B vaccine is currently serving as an example of how a relatively 
expensive vaccine might be introduced. As a means of long-term control of 
hepatitis B infection, the EPI Global Advisory Group has recommended that 
all infants be immunized through complete integration of hepatitis B vaccine 
into routine childhood immunization programs (1, 13). The World Health 
Organization, UNICEF, and others in the international community are using 
the Children’s Vaccine Initiative to bring such vaccines into general use in 
developing countries. It will be a great tragedy if hepatitis B and other 
vaccines, which could have their greatest impact in developing countries, 
cannot be used in these countries because of cost. 


SUPPORT OF OTHER PRIMARY HEALTH 
CARE PRACTICES 


Some 500 million contacts with infants and their mothers occur each year 
through immunization programs. These contacts can be used to provide other 
primary health care practices that target infants and women of childbearing 
age. One example is the assistance of immunization programs in vitamin A 
and iodine supplementation activities in areas in which deficiencies remain 
serious problems. The EPI is seeking simplified ways to implement and 
monitor the provision of such supplements through the contacts afforded by 
immunization services. 

Another example is the teaching module on birth spacing, which was 
developed jointly by the WHO Divisions of Family Health and Diarrheal and 
Acute Respiratory Disease Control and the EPI. This teaching module can be 
introduced into training courses. Other examples of EPI contributions to 
general primary health care include the development of management skills, 
problem-solving approaches, logistics systems, training and survey 
methodologies, and evaluation tools suitable for primary health care use (4). 


RESEARCH AND DEVELOPMENT 


Research and development activities are a prominent feature of global im- 
munization programs. Such activities have resulted in accomplishments that 
include improved and alternative energy refrigeration equipment, cold chain 
monitors, field steam sterilizers, plastic reusable and autodestruct syringes, 
immunization coverage and missed opportunity survey methodologies, and 
successful field trials of new vaccines (7, 8, 14, 15). 

Continued research and development activities directed at solving op- 
erational problems are an important aspect of immunization programs at all 
levels. High priority areas of research include: 

1. Improved disease control strategies: Refining immunization strategies, 
developing and introducing new or improved vaccines, studying the 
acceptability of immunization, and studying the delivery of immunization- 
related services through the primary health care infrastructure. 

2. Improved methods and materials for diagnosis of the EPI target diseases 
and environmental sampling: Making full use of available technology for 
rapid and simplified diagnosis at field level and identification of wild poliovir- 
uses in the environment. 

3. Improved surveillance and program monitoring tools: Testing the ability 
of surveillance system indicators to improve routine surveillance of prevent- 
able infectious diseases; developing methods for improving surveillance for 
acute flaccid paralysis, neonatal deaths, and rash illness; and determining 
methods of assessing the effectiveness of mopping-up and outbreak response 
activities. 

4. Improved methods and materials for the cold chain and logistic support: 
Developing and testing refrigeration and injection equipment; conducting and 
refining studies and surveys on the quality of the cold chain; and investigating 
technologies and methodologies for improving logistics and transport (includ- 
ing computerized logistics management tools, vehicle maintenance, and driv- 
er safety). This research relies extensively on TECHNET, a global network of 
cold chain and logistic experts who plan and conduct such research (7). 

A complete list of priority research needs is periodically reviewed by the 
EPI Research and Development Group, which meets every six months to 
monitor progress in immunization program-related research (5). 


CONCLUSION 


The EPI promotes extremely cost-effective interventions. The investment in 
immunization services makes sound economic, epidemiological, and political 
sense. Disease prevention through immunization reduces not only 
deaths, but also the need for expensive curative and rehabilitative care. 
Immunization programs can contribute to building up the health infrastructure 
from which many other health interventions are more effectively and effi- 
ciently promoted. This infrastructure helps provide the health contribution to 
national development. 

Immunization, in both industrialized and developing countries, will contin- 
ue to be an important element of national health programs as new and 
improved vaccines become available. Each dollar invested will result in an 
even greater return in prevented medical care costs, disability, and death. 

To these direct benefits are added important indirect benefits: Immuniza- 
tion provides a means of helping break the vicious cycle of high infant and 
childhood mortality rates. It acts in strong synergy with family planning 
activities to reduce the total number of births to that which is desired by the 
family and safe for the mother. This reduction in births further reduces infant 
and child mortality, as well as maternal mortality. These benefits make the 
further expansion of immunization services one of the best bargains available 
for primary health care and national development. 

Immunization programs have entered into a new decade filled with exciting 
challenges. The continued commitment of the international community to 
help countries meet these challenges will move us all closer to the ultimate 
vision of a world free of suffering, disability, and death caused by vaccine- 
preventable diseases. 


INTRODUCTION 


Paralytic poliomyelitis, once so greatly feared, is on the verge of being 
eliminated from the Western Hemisphere. A 1985 eradication program has 
helped guide a more recently launched global eradication effort. In addition, 
oral polio vaccine (OPV), given in large-scale programs, has been essential to 
this success. 

Before 1955 and the licensure of inactivated polio vaccine (IPV), 
poliomyelitis was a continuing major cause of permanent disability across the 
world. In the United States alone, more than 20,000 cases of paralytic polio 
were reported annually during the early 1950s (32). From 1955 to 1961, 
more than 300 million doses of the newly licensed IPV were administered, 
with a resultant decrease of 90% in the incidence of polio (Figure 1). 
However, because of the occurrence of vaccine-induced polio in the spring of 
1955, the process of vaccine manufacture had to be changed, which resulted in 
a vaccine of substantially lower potency than that which had been used in 
prelicensure trials (19). Not surprisingly, there was an increase in polio 
incidence during 1958 and 1959, partly because of the use of low potency 
vaccine (12). This occurrence gave added impetus to the development of a 
live, oral polio vaccine, which was introduced in 1961. The OPV was initially 
a monovalent preparation, but, within three years, a trivalent preparation was 
substituted. The formulation was based on successful programs in Canada, 
where investigators used a trivalent OPV preparation composed of 
1,000,000, 100,000, and 300,000 TCID50 of poliovirus types 1, 2, and 3, 
respectively (36). By 1965, trivalent OPV had completely replaced the 
monovalent antecedents in the US (40). Since 1965, approximately 20 million 
doses of trivalent OPV have been administered each year in the US. Since 
1968, only 0.5% of the polio vaccine doses administered in the US have been 
IPV (12). 

Figure 1 Reported paralytic poliomyelitis in the United States, 1952-1989. 

In 1974, nearly 20 years after polio vaccine was first introduced, the World 
Health Organization (WHO) established the Expanded Programme on Im- 
munization (EPI) (42). Thereafter, several vaccines were increasingly used. 
In addition to OPV, there are vaccines against measles and tuberculosis, as 
well as the familiar diphtheria-pertussis-tetanus vaccine. 

This review describes the experience of the program coordinated by the Pan 
American Health Organization (PAHO) in the Americas. We focus on the 
OPV, which was first used to control, and then to interrupt, the indigenous 
transmission of wild poliovirus. In addition, we discuss the choice of vaccine, 
the polio eradication initiative, the strategies for vaccine delivery, important 
problems that have been encountered, and the progress achieved thus far. 


THE POLIO ERADICATION INITIATIVE 


In 1985, 11 years after the EPI was launched, PAHO adopted the goal of polio 
eradication (8, 29). The stated objective was to interrupt the transmission of 
wild poliovirus in the Americas by the end of 1990, thereby eradicating the 
disease. Many public health experts were skeptical that this goal was realistic. 
Several factors, however, encouraged this decision. Most important was the 
experience with smallpox (9), which proved that an infectious disease could be 
eradicated. By 1985, polio incidence had decreased sharply in most countries 
(Figure 2), and the number of countries reporting cases of poliomyelitis in the 
Americas had decreased from 19 to 11 (7). Moreover, vaccine coverage levels 
for polio had reached all-time highs in many countries. 


PROGRAM STRATEGIES 


General 


The overall program called for a three-part strategy: achievement and 
maintenance of high immunization levels by using OPV, from the smallest 
geopolitical level, the municipality or county, to the national level; effective 
surveillance and accurate diagnosis of all cases of acute flaccid paralysis 
among those individuals under 15 years of age; and area-wide vaccination 
around all new cases. During the four years since funding became available, 
this strategy, which uses OPV as the vaccine of choice, has been remarkably 
successful. 

Figure 2 Rate per 100,000 population of reported paralytic poliomyelitis and 
OPV coverage in children one year of age. 

For technical guidance and to provide recommendations crucial for pro- 
gram management, PAHO has established the EPI Technical Advisory Group 
(TAG), which is composed of five international experts. They meet every six 
to nine months to review progress and to alter, as necessary, program 
strategies. The TAG also promotes understanding of and support for program 
goals among bilateral, multilateral, and private agencies, technical in- 
stitutions, and political leaders. 

To address the financial support issues for implementation of EPI and the 
polio eradication effort, an Inter-Agency Coordinating Committee (ICC) was 
created at the regional level. The ICC has representatives from PAHO, 
UNICEF, the United States Agency for International Development, the 
Inter-American Development Bank, Rotary International, and the Canadian 
Public Health Association. The committee has been replicated in each country, 
where government representatives were also included. The ICC has 
demonstrated that diverse organizations can work together to achieve impor- 
tant public health objectives. 


Oral Poliovirus Vaccine 


The success of OPV in the US, Canada, most European countries, and the 
USSR made it a logical choice for use in the Americas (31). Other important 
reasons to use OPV included the substantially lower cost of OPV compared 
with IPV; the ability of OPV to induce intestinal immunity, thus facilitating 
the interruption of wild poliovirus transmission; the capacity of OPV viruses 
to spread and immunize close contacts; the demonstrated efficacy of OPV in 
controlling outbreaks; the ease of administration of OPV, a significant advan- 
tage in mass campaigns; and the potential ability of OPV viruses to displace 
the circulation of wild poliovirus in the environment (13, 18, 39). 

Investigators recognized that two factors could potentially reduce the 
effectiveness of OPV programs and that these factors needed special atten- 
tion. Oral polio vaccine is more heat sensitive than IPV and must be preserved 
at 0-8° Celsius or lower almost to the time of administration. This required 
the development and operation of a cold chain for vaccine distribution, which 
has been achieved. Also, unlike IPV, OPV can cause vaccine-associated 
paralysis. Studies show, however, that this occurs so infrequently that OPV is 
a very safe product by any pharmaceutical standards. 

The question of the vaccine of choice was explicitly addressed in 1977, and 
again in 1988, by a select committee of the Institute of Medicine of the 
National Academy of Sciences, which reaffirmed the validity of the PAHO 
policy (13, 14). The committee considered the polio immunization policy for 
the US to determine whether a change to routine use of IPV or IPV followed 
by OPV was warranted to reduce vaccine-associated paralysis without 
affecting achievements of the program. Their review showed that, during 
1975-1986 in the US, the risk of vaccine-associated acute flaccid paralysis was 
about one case for every 2.7 million doses of OPV administered. Most cases 
were associated with the administration of the first dose (20), a risk estimated 
to be one case for every 560,000 first doses of vaccine. To assess the risks and 
benefits of each vaccine, a mathematical modeling analysis was performed to 
estimate risks and benefits over a 30-year period for two cohorts of 3.5 
million children each; one cohort would have received OPV, and the other 
IPV (12). The model assumed periodic importations of wild poliovirus, a 
coverage rate of 95%, and an efficacy of 98% for both vaccines. The model 
predicted seven times as many cases of paralytic disease if IPV, rather than 
OPV, were used. Because of these and other considerations, the committee 
recommended no change in current US policy (13). However, the committee 
did recommend that after enhanced IPV combined with diphtheria-tetanus- 
pertussis (DPT-E-IPV) was licensed, consideration should be given to a 
regimen of two or more doses of DPT-E-IPV (in place of DPT alone) 
followed by successive doses of OPV. 
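
The risk figures quoted above can be turned into expected case counts with a few lines of arithmetic. The following is a minimal sketch, not part of the committee's analysis: the constant and function names are hypothetical, and the annual dose count is the approximate figure of 20 million doses per year given in the Introduction.

```python
# Illustrative arithmetic only; every figure is one quoted in the text.
DOSES_PER_CASE_ALL = 2_700_000    # one case per 2.7 million OPV doses (US, 1975-1986)
DOSES_PER_CASE_FIRST = 560_000    # one case per 560,000 first doses
ANNUAL_US_DOSES = 20_000_000      # approximate trivalent OPV doses per year in the US


def expected_cases(doses: int, doses_per_case: int) -> float:
    """Expected number of vaccine-associated cases for a given number of doses."""
    return doses / doses_per_case


def relative_risk_first_dose() -> float:
    """How many times higher the first-dose risk is than the overall per-dose risk."""
    return DOSES_PER_CASE_ALL / DOSES_PER_CASE_FIRST


annual = expected_cases(ANNUAL_US_DOSES, DOSES_PER_CASE_ALL)
print(f"Expected vaccine-associated cases per year: {annual:.1f}")
print(f"First-dose risk relative to overall risk: {relative_risk_first_dose():.1f}x")
```

Under these assumptions the overall rate implies roughly seven vaccine-associated cases per year, and a first dose carries close to five times the overall per-dose risk.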


Delivery of OPV 


When the eradication program began, alternative vaccine delivery strategies 
were weighed (38). A review was undertaken of the experience of countries in 
which wild poliovirus transmission appeared to have been interrupted. 

Cuba was the first country to undertake organized mass campaigns (37) and 
the first populous country to interrupt wild poliovirus transmission. Annual 
campaigns began in 1962, and shortly thereafter paralytic poliomyelitis dis- 
appeared (Figure 3). In Cuba, OPV is distributed only during two one-week 
periods each year with a two-month interval between them. During these 
periods, OPV is given to all children aged 0-10 years (more recently, 0-5 
years), irrespective of immunization status. 

Before 1980, the Brazilian Ministry of Health found it impossible to 
eliminate polio by using a distribution system that relied solely on immuniza- 
tion in the existing health services units (35). Because of the negligible impact 
on disease incidence in many states, the Ministry decided to inaugurate 
National Vaccination Days (Figure 4). As in Cuba, the Vaccination Days 
were organized twice each year with a two-month interval between them. 
During these days, every child less than five years of age was offered 
vaccination, regardless of immunization status. Similar strategies were im- 
plemented in Chile and Costa Rica where, as in Brazil, the impact on disease 
incidence was similar to that observed in Cuba (7, 16). 

Figure 3 Poliomyelitis in Cuba, 1946-1988. 


Similar delivery strategies had been previously used by industrialized 
countries. In the early days of oral polio immunization, the US likewise held 
campaigns, which were referred to as “Polio Sundays” (32). Through the first 
three years after OPV licensure, almost all vaccine was used in mass 
vaccination campaigns (3, 5). Subsequently, vaccine was administered as a 
routine service by health care providers. The transmission of indigenous wild 
poliovirus appears to have been interrupted in the early 1970s (Figure 5) (15). 
Only three outbreaks of polio have occurred in the US during the last 15 
years. All three outbreaks followed importations, and a total of 40 cases were 
reported (4). The last case caused by wild poliovirus occurred in 1979 among 
a religious sect that had refused immunization services. Since 1980, six to ten 
cases of vaccine-associated polio have occurred each year (Figure 5). 

Figure 4 Polio cases by four-week period in Brazil, 1975-1984. 

These experiences led PAHO in 1983, two years before its decision to 
undertake eradication, to state in a position paper that National Vaccination 
Days should be an integral part of the EPI strategy and that these days should 
not be a substitute for immunizations offered during routine health care visits 
(28). The paper recognized that, given the existing health care infrastructures 
in Latin America, the eradication of polio would be impossible without such 
campaigns, as would 
satisfactory levels of coverage with the other antigens. The managerial skills 
acquired in the course of these programs greatly enhanced the capacity of the 
health service staff to deal with other infectious diseases. 


Surveillance 


From the inception of the eradication program, surveillance has been a critical 
strategy for its success (7). Uniform case definitions (Table 1) were adopted 
by all countries. Surveillance indicators of program performance, especially 
those relating to the completion and timeliness of case investigations, were 
established and incorporated as an integral part of the surveillance system 
(27). By the end of 1989, after the system had been computerized (Polio 
Eradication Surveillance System), analysis of data could be performed at 
various levels of the health system. The information gained from the analysis 
of these data has been used to adjust program strategies. 

Figure 5 Reported paralytic polio cases (total, excluding imported cases, and 
vaccine-associated cases), in the United States, 1960-1989. 

246 DE QUADROS ET AL 

Table 1 Case definitions for paralytic poliomyelitis used by PAHO for the 
Americas, 1985-1989 

Suspected case: Any acute onset of paralysis in a person less than 15 years of 
age for any reason other than severe trauma, or paralytic illness in a person 
of any age in whom polio is suspected. This classification is temporary, and 
within 48 hours the case should be reclassified as probable polio or discarded. 

Probable case: Suspected case with acute flaccid paralysis for which no other 
cause can be immediately identified. Within ten weeks of onset of paralysis, 
this case should be reclassified as confirmed polio or discarded. 

Confirmed case: A probable case is classified as confirmed if there is 
wild-type poliovirus isolated in the stool; epidemiologic linkage to a probable 
or confirmed case; residual paralysis 60 days after onset; death; or lack of 
follow-up of a case. 

Analysis of the cases that occurred throughout the Americas in 1989 [an 
analysis subsequently repeated in 1991 (unpublished data)] suggested that 
some of the 128 polio cases confirmed that year had almost certainly been 
erroneously classified as confirmed polio, particularly those in patients who 
had been lost to follow-up or who had died (1). Compared with patients from 
whom wild poliovirus was isolated, patients who died or were lost to 
follow-up were more likely to be more than five years of age and to be afebrile 
at the time of onset of paralysis. These patients were also less likely to have had 
adequate stool specimens taken for virus culture. To increase the specificity of 
diagnosis, a revised case definition of confirmed polio was decided upon and 
implemented in 1990: acute flaccid paralysis associated with isolation of wild 
poliovirus. A separate category, termed “compatible” polio case, included 
those patients with paralytic illness from whom no wild poliovirus was 
isolated, but who had clinically compatible residual paralysis at 60 days, had 
died, or had been lost to follow-up and from whom two adequate stool 
specimens had not been obtained within two weeks after onset of paralysis. 
The definition for vaccine-associated cases remained the same, but these 
cases were separately reported and tabulated. 
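
The revised 1990 definitions amount to a small decision rule, which can be sketched as follows. This is an illustration under stated assumptions rather than PAHO's actual procedure: the `AfpCase` record and its field names are hypothetical, the stool-specimen condition is read as applying to all three "compatible" outcomes, the "discarded" fallback is assumed by analogy with Table 1, and vaccine-associated cases are left out because their definition is not given in the text.

```python
from dataclasses import dataclass


@dataclass
class AfpCase:
    """Hypothetical record for one case of acute flaccid paralysis (AFP)."""
    wild_virus_isolated: bool
    residual_paralysis_at_60_days: bool
    died: bool
    lost_to_follow_up: bool
    two_adequate_stools_within_2_weeks: bool


def classify_1990(case: AfpCase) -> str:
    """Classify an AFP case using the 1990 definitions as read from the text."""
    # Confirmed: AFP associated with isolation of wild poliovirus.
    if case.wild_virus_isolated:
        return "confirmed"
    # Compatible: no wild virus, an unresolved outcome, and inadequate stool sampling.
    unresolved = (
        case.residual_paralysis_at_60_days
        or case.died
        or case.lost_to_follow_up
    )
    if unresolved and not case.two_adequate_stools_within_2_weeks:
        return "compatible"
    # Assumption: all other cases are discarded; the text does not spell this out.
    return "discarded"


example = AfpCase(
    wild_virus_isolated=False,
    residual_paralysis_at_60_days=True,
    died=False,
    lost_to_follow_up=False,
    two_adequate_stools_within_2_weeks=False,
)
print(classify_1990(example))
```

The point of the split is specificity: only virologically proven cases count as confirmed, while cases that cannot be ruled out are tracked separately instead of inflating the confirmed total.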


Accelerated Strategies: The Final Stages 


As polio incidence declined to low levels, another strategy, “Operation 
Mop-up,” was incorporated into the eradication program in 1989 (24, 25). 
Genomic sequencing of wild poliovirus strains indicated that different 
strains occupied distinct geographic areas and that 
these areas were steadily shrinking in size (6, 34). Moreover, sustained 
transmission, as with smallpox, appeared to require crowded, lower 
socioeconomic populations. Accordingly, special house-to-house campaigns were 
mounted to vaccinate all children less than five years of age who lived in areas 
considered to be at risk for transmission of disease. Areas at risk were 
determined by using information on vaccination coverage, previous occur- 
rence of polio cases, population density, and size of migrant population. This 
special intervention was expected to be a final blow for the interruption of 
transmission of virus in the few remaining loci. 


OTHER ISSUES ENCOUNTERED 


The surveillance program, which was greatly strengthened after eradication 
began, almost immediately identified problems with the vaccine. This was 
shown in outbreaks of type 3 polio in Brazil in 1986 and in Mexico in 1989. 
Before the outbreak in Brazil, the formulation of the OPV used was 10^6, 10^5, 
and 10^5.5 TCID50 for types 1, 2, and 3 (a 10/1/3 ratio of amounts of types 1, 
2, and 3 components in the vaccine), as recommended by WHO (41). The choice for 
global utilization of the 10/1/3 formulation was based on the results of the 
original Canadian field trial performed in 1961 (36) and on the subsequent 
success of programs that used the 10/1/3 formulation ratio, such as those in 
Canada and the US. However, as pointed out in an extensive review of the 
subject (31), the reason for choosing such a relatively low dosage of the type 3 
component, as compared with the type 1 component, appears questionable, as 
the infectivity of types 1 and 3 is almost the same (33). Investigators of 
the 1986 Brazilian outbreak noted that many of the children who contracted 
type 3 poliovirus had previously been fully immunized with OPV (30). 
Because of concern about the vaccine, a field trial was conducted to compare 
the immunogenicity of different formulations of OPV. This study revealed 
that children receiving vaccine with a 10/1/6 formulation were almost three 
times more likely to have a serological response to the type 3 component than 
children receiving vaccine with the 10/1/3 formulation. As a consequence, 
PAHO promptly recommended the use of 10^6, 10^5, and 10^5.8 TCID50 for types 
1, 2, and 3 (a 10/1/6 formulation), and WHO subsequently followed suit (26). 

In Mexico, the 1989 outbreak of 17 cases of polio was also caused by wild 
type 3 poliovirus, apparently because of vaccine failure. The OPV used in 
Mexico for routine vaccination is made locally, whereas that customarily used 
for campaigns is imported. Because many children in the outbreak had 
previously received more than three doses of vaccine from health service 
units, this vaccine was tested and found to be of low potency for all three 
components (unpublished data). Immediate steps were taken to assure that 
OPV with the 10/1/6 formulation was routinely used throughout Mexico. 

As noted previously, it would not be possible to replace OPV with IPV 
alone. Other strategies propose combining OPV and IPV use (17). Presumably, 
this would prevent or reduce the frequency of vaccine-associated paralysis 
(10). Because the greatest risk of vaccine-associated polio occurs after the 
first dose of OPV (20), investigators have proposed that IPV be given as the 
first dose of a series, with OPV given subsequently to ensure the desired 
intestinal immunity. 
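
The 10/1/3 and 10/1/6 formulation ratios discussed in this section follow directly from the virus titers of the three components. A minimal sketch, assuming log10 titers of 6.0, 5.0, and 5.5 TCID50 for the original formulation and 6.0, 5.0, and 5.8 for the revised one; these exponents are inferred here from the stated ratios and from the 1,000,000 / 100,000 / 300,000 TCID50 figures in the Introduction, and the function name is hypothetical.

```python
def formulation_ratio(log10_titers):
    """Return each component's titer relative to the smallest, rounded to 1 decimal."""
    titers = [10 ** x for x in log10_titers]
    smallest = min(titers)
    return [round(t / smallest, 1) for t in titers]


# log10 TCID50 for poliovirus types 1, 2, and 3 (assumed values, see lead-in)
ORIGINAL = (6.0, 5.0, 5.5)
REVISED = (6.0, 5.0, 5.8)

print("original:", formulation_ratio(ORIGINAL))
print("revised: ", formulation_ratio(REVISED))
```

On these assumptions the type 3 component rises from about 3.2 to about 6.3 times the type 2 titer, that is, it roughly doubles, while types 1 and 2 are unchanged.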

Any change in strategy must not disrupt existing immunization schedules. 
To assure this, IPV would need to be administered coincident with DPT. To 
avert multiple injections, the antigens should be incorporated into a single 
DPT-IPV injection. Such a preparation is available commercially, but, so far, 
DPT-IPV has been cost prohibitive in the Americas. One dose each of DPT 
and OPV costs PAHO less than US$0.05, compared with more than US$0.60 
for one dose of DPT-IPV. Most countries of the Americas are not currently 
prepared to assume the burden of a tenfold increase in the cost of vaccines. 
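
The size of the cost gap can be checked from the two prices quoted above. A small sketch using exact fractions to avoid floating-point rounding; the constant and function names are hypothetical, and because the text gives only bounds ("less than" and "more than"), the result is a lower bound on the true multiple.

```python
from fractions import Fraction

# Bounds quoted in the text, in US dollars per dose.
COST_DPT_PLUS_OPV = Fraction(5, 100)   # "less than US$0.05" (upper bound)
COST_DPT_IPV = Fraction(60, 100)       # "more than US$0.60" (lower bound)


def cost_multiple(new_cost: Fraction, old_cost: Fraction) -> Fraction:
    """How many times more expensive the new regimen is than the old one."""
    return new_cost / old_cost


multiple = cost_multiple(COST_DPT_IPV, COST_DPT_PLUS_OPV)
print(f"DPT-IPV costs at least {float(multiple):.0f} times as much per dose")
```

The bounds give a factor of at least 12, so "tenfold" in the text is a conservative round figure.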


PROGRESS OF ERADICATION 


In 1989, 128 cases of confirmed poliomyelitis were reported in the Americas, 
an 86% decline from the 930 confirmed cases reported in 1986. This decline 
occurred despite better surveillance and a twofold increase in the number of 
reported cases of acute flaccid paralysis, from about 1000 in 1985 to 2000 in 
1989. 


As the program progresses, fewer cases of acute flaccid paralysis are 
determined to be confirmed poliomyelitis. In 1990, over 50% of cases of 
acute flaccid paralysis were diagnosed as Guillain-Barré syndrome. 
Other, less frequent causes were transverse myelitis, tumors, and traumatic 
neuritis. 

The decline in the incidence of paralytic polio was also coincident with 
greatly improved OPV coverage in young children. In 1978, the regional 
estimate of coverage with three doses of OPV in one-year-old children was 
38%. In 1988, coverage estimates were greater than 70%; in 1989, estimated 
coverage reached 73%. By the end of 1990, immunization coverage for all the 
EPI vaccines had reached an all-time high: No vaccine was at less than 70%, 
and levels of 80% were recorded in several subregions, such as the 
English-speaking Caribbean countries and the countries of the Southern Cone 
(Argentina, Chile, Paraguay, and Uruguay) (21). Although polio vaccination levels 
should be interpreted with caution, because of changes over time in the 
methodology for assessing coverage, results such as these are encouraging for 
the rest of the world. 

The 128 confirmed polio cases that occurred in 1989 were located in 99 
(0.7%) of the 14,372 counties in Latin America. Of the 128 confirmed cases, 
24 were associated with wild poliovirus isolation (7). The cases associated 
with isolation of wild poliovirus were located in three areas: 13 (all with type 
3 isolates) in northwestern Mexico; three with type 3 and six with type 1 
isolates in the northern Andean subregion; and two with type 1 isolates in 
northeastern Brazil. Of the remaining confirmed cases, seven were 
vaccine associated, 19 were lost to follow-up, ten died, and 60 had clinically 
compatible residual paralysis, but no poliovirus isolation. 

In 1990, there were 18 confirmed cases of poliomyelitis, a 25% decline 
from the 24 cases with wild virus isolates that occurred in 1989, and a 44% 
decline from 32 such cases in 1988. The 1990 confirmed cases were located in 
only two geographic regions: seven in western Mexico and three in neighbor- 
ing Guatemala; and eight in the northern Andean subregion in the countries of 
Colombia, Ecuador, and Peru (2, 21, 23). Poliovirus isolates from Mexico 
and Guatemala were wild type 3, and genomic sequencing indicated that they 
were genetically linked to one common ancestral focus of infection. The 
poliovirus isolates from the northern Andean subregion were all type 1 and, 
unlike the type 3 isolates from Mexico and Guatemala, were genetically 
unrelated. Apparently, there are separate foci of wild virus transmission in the 
northern Andean area, which will require more intensive efforts for interrup- 
tion of transmission. Of the 75 compatible cases that occurred in 1990, 21 
were lost to follow-up, 13 died, and 41 had clinically compatible residual 
paralysis. 
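
The percentage declines and case totals reported in this section can be re-derived from the counts given in the text. A minimal sketch; the helper name is hypothetical and only figures quoted above are used.

```python
def percent_decline(before: int, after: int) -> float:
    """Percentage decline from an earlier count to a later one."""
    return 100 * (before - after) / before


# Confirmed cases, 1986 -> 1989 (1985-1989 case definition)
print(round(percent_decline(930, 128)))    # reported as 86%

# Cases with wild virus isolation, 1989 -> 1990 and 1988 -> 1990
print(round(percent_decline(24, 18)))      # reported as 25%
print(round(percent_decline(32, 18)))      # reported as 44%

# Share of Latin American counties with a confirmed case in 1989
print(round(100 * 99 / 14372, 1))          # reported as 0.7%

# Internal consistency of the 1990 breakdowns
confirmed_1990 = 7 + 3 + 8       # western Mexico, Guatemala, northern Andean subregion
compatible_1990 = 21 + 13 + 41   # lost to follow-up, died, residual paralysis
print(confirmed_1990, compatible_1990)
```

Each computed value agrees with the rounded figure in the text, and the 1990 breakdowns sum to the stated totals of 18 confirmed and 75 compatible cases.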

Thus far in 1991, there have been six confirmed polio cases with wild type 1 
poliovirus isolated (onset of the last case, April 8) in Colombia. Extensive 
immunization campaigns have been undertaken in all areas with cases in 1990 and 1991 and in 
other areas at risk. 

In brief, despite progressively improved reporting, as exemplified by year- 
ly increases in the number of cases of acute flaccid paralysis reported, there 
has been a rapid decline in confirmed cases of polio, reaching record lows 
each year from 1986 to 1990. By using mass campaigns with OPV as its 
primary strategy, the program is on the verge of achieving the eradication of 
polio in the Western Hemisphere. 

In July 1990, the International Commission for the Certification of Eradica- 
tion of Poliomyelitis in the Americas met for the first time to discuss certifica- 
tion procedures (22). The Commission recommended that not only would 
countries need to document the absence of wild poliovirus circulation by 
using conventional surveillance procedures, but environmental studies would 
also be needed. Encouraging results have been reported with the 
polymerase chain reaction technique for the direct detection and characteriza- 
tion of polioviruses in sewage samples collected from high risk areas in 
Brazil. This will be an important new tool for wild poliovirus surveillance in 
the Americas (unpublished data). 


CONCLUSIONS 


Although OPV is not a perfect vaccine, it remains one of the cheapest, safest, 
and easiest of all vaccines to administer. New candidate vaccines derived 
from existing Sabin strains may potentially improve on safety and im- 
munogenicity (43). Regardless, barring extensive civil disorder or other 
unforeseen difficulties, the eradication of polio will soon be achieved in the 
Americas. Progress in the Americas precipitated the May 1988 decision by 
the 41st World Health Assembly to adopt the goal of global eradication by 
the year 2000 (11, 42, 44). To that end, WHO is recommending the same 
strategies used in the Americas, including the use of OPV in mass campaigns. 

However, several critical questions remain to be answered for the global 
initiative. Although adequate for the Americas, is the current level of anti- 
genicity of OPV capable of eradicating wild poliovirus transmission in the rest 
of the world, notably Africa? Is there sufficient political and social will to 
accomplish such a task? Are the financial and technical resources available to 
develop and maintain the necessary surveillance and laboratory support 
systems? How we respond to these questions will decide whether we leave our 
children the legacy of the eradication of polio. 


INTRODUCTION 


This paper addresses issues pertinent to the health of, and health care systems 
for, college students. We describe characteristics of the college student 
population, including important subgroups of students with unique health 
problems. After briefly reviewing the history and current practice of college 
health services, we address specific health problems and current and future 
issues for college student health. 


INSTITUTIONS OF HIGHER EDUCATION 


In 1990, there were more than 3500 colleges and universities in the United 
States (49), which range in size from the smallest technical and trade schools 
to comprehensive research universities with enrollments that exceed 50,000 
students. Generalizations are difficult, because of the remarkable diversity of 
institutional morphology, which arises from variations in public or private 
governance and accountability; student population size, gender, ethnic char- 
acteristics, and residential versus commuter status; number and type of gradu- 
ate, professional, and/or research programs; and the overall financial resource 
base of the institution. 

From a public health standpoint, these institutions may be viewed as 
complex combinations of schools and workplaces in which social, environ- 
mental, behavioral, political, economic, legal, philosophical, and cultural 
issues conspire to create unique and difficult challenges for health promotion, 
disease prevention, and medical care. In part, this is because of the tradi- 
tionally open nature of college communities. Colleges and universities are 
unlike primary and secondary schools, in which the local school district and 
parents share authority. They are also distinct from traditional workplaces, in 
which employer-employee relationships, management structures, collective 
bargaining rules, and other hierarchical processes define issues of authority, 
accountability, and responsibility. 

Post-secondary students come and go. They commonly shift geopolitical 
jurisdictions because of their education. Although they often need them, 
students typically are ineligible for public social and human services, the 
eligibility for which is usually based upon complicated residence, income, 
and working status requirements. Universities vary tremendously with respect 
to how much, if at all, they attend to their students’ nonacademic needs. 
Thus, college and university environments exist as extraordinarily complex 
social systems with nonuniform policies, unstable populations, and a wide 
range of relationships to the communities in which they are located. 


THE COLLEGE STUDENT POPULATION 


In the fall of 1988, 13,043,118 students attended colleges and universities in 
the US (50). Only 57% of these students were 24 years of age or younger, 
thus dispelling the common misperception that college students are 18-22 
years old. Nearly 30% were aged 30 years or older. Overall, 54.6% were 
female. With regard to ethnicity, 81% were non-Hispanic whites, 9% were 
blacks, 5% were Hispanics, 4% were Asian/Pacific Islanders, and 1% were 
American Indian; 20% lived in school-owned housing, 50% off-campus, and 
30% with parents. Some 38% described themselves as independent. 

It is common for college health practitioners to define and characterize 
subpopulations of students (57). Grouping may be based upon preexisting 
health status or other shared characteristics on entry, or upon participation, 
while at the university, in environments associated with risk for health 
problems. Four important groups are as follows: 


Disabled Students 


Of the 12.5 million college students enrolled in the fall of 1986, 1,319,229 
(10.5%) had at least one disability (51). In 1988, 6% of full-time college 
freshmen were reported as having at least one disability, which more than 
doubles the figure for 1978 (47). Over half of these students have “hidden” 
disabilities, such as learning disorders (27). According to Section 504 of the 
Rehabilitation Act of 1973, a student qualifies as having a disability if he or 
she “has a physical or mental impairment which substantially limits one or 
more major life activity; has a record of such impairment; or is regarded as 
having such impairment.” Common disabilities seen among college students 
include visual handicaps; deafness and hearing impairment; speech im- 
pairment; neurologic and orthopedic handicaps; chronic diseases and con- 
ditions, such as asthma, arthritis, lupus, diabetes, and cystic fibrosis; and 
chronic psychiatric disorders. 


International Students 


In 1989-1990, there were more than 385,000 international college students 
(66). The majority came from Asian nations, with China, Taiwan, Japan, and 
Korea leading the list. Latin America, Europe, the Middle East, and Africa 
accounted for 11.9%, 9.7%, 6.4%, and 4.8%, respectively, of the in- 
ternational student population. An estimated 35,000 additional students are 
enrolled in intensive English language programs, which are often attended 
before official enrollment in a college or university. Although most in- 
ternational students come to the US alone, some bring spouses and children. 
International students, who have unique ethnic and culture-specific beliefs, 
present special health needs (6). It is common to have only a few fellow 
nationals on a given campus at any one time. The sense of isolation felt by 
such students contributes to, and is often made worse by, illness and its 
concomitant dependency. 


Health Professions Students 


The health and health-related professions, such as medicine, nursing, den- 
tistry, dental hygiene, physical therapy, and many of the biologic sciences, 
account for almost 450,000 students (26). Characterized by learning environ- 
ments that require either direct patient contact or exposure to blood and 
patient tissue, such students are unique in their needs and demands for health 
services. Routine health problems found in this age group may be exaggerated 
in their incidence and importance because of heightened awareness brought 
about through study. The prevention and management of communicable 
diseases, such as tuberculosis, hepatitis B, and human immunodeficiency 
virus (HIV) infection, present major challenges for student health practitioners.


Nontraditional Students 


“Nontraditional student” is a term used often and imprecisely, which general- 
ly denotes older, part-time, and working students. On some campuses, partic- 
ularly commuter campuses, they comprise more than half of all students. 
However, one must not assume that all students over a certain age, for
example 28 or 35, fit into this category. Many older students are full-time 
students who have left a job, the military, or some other environment to 
pursue one or more years of study, or they are graduate students in extended 
length programs. We reserve the term nontraditional student for those stu- 
dents whose primary sphere of activity is away from the campus environment. 
Depending upon their age and health status, nontraditional students may 
substantially broaden the range and complexity of health problems seen in a 
campus health center. 


HISTORICAL ASPECTS OF COLLEGE HEALTH 


The history of college health practice has been addressed in numerous pub- 
lications over the past several decades (7, 8, 33, 34, 38, 40). Some historical 
aspects of college health are of particular relevance to the field of public 
health. For example, of the many early influences on college health, physical 
activity and health education were among the most important. This influence
appeared in the early 1800s in an effort to import the mens sana in
corpore sano model of fitness from European higher education. It was coupled with
curricula in what was popularly called “hygiene”: at Williams College in 1851,
and later in the same year at the City College of New York, students were
educated on “the active duties of operative life, rather than those more 
particularly regarded as necessary for the pulpit, bar, or medical profession” 
(41). 

During the latter half of the 1800s, several colleges and universities opened 
health centers based upon the sentiment expressed in 1856 by President 
Stearns of Amherst who noted that “the breaking down of health of students, 
especially in the spring of the year, which is exceedingly common, involving 
the necessity of leaving college in many instances, and crippling the energies 
and destroying the prospects of not a few who remain, is in my opinion 
wholly unnecessary if proper measures could be taken to prevent it” (22). In 
1859, Amherst established a Department of Physical Education and Hygiene, 
generally regarded as the first college health service. Mount Holyoke and 
Vassar followed suit in 1861 and 1865, respectively. The health physician at 
each of these colleges had both clinical and teaching duties. The first “com- 
prehensive” student health care services were probably offered at these two 
women’s colleges. Combining medical services, infirmary care, nursing ser- 
vices, and health promotion activities, these centers carried out almost all 
aspects of current-day student health services. 

The ascendency of public health knowledge and practice from the turn of 
the century through World War I contributed to college health practice. The 
federal government turned to Dr. Thomas Storey, Professor and Director of 
Hygiene at the City College of New York, to head an agency aimed at
allocating federal resources for venereal disease control. Because of his view 
of the importance of university environments to the control of this problem, 
Dr. Storey ensured that some of these resources were spent to improve college 
health practice (7). After World War I, Dr. Storey’s influence on college 
health continued with his 1927 publication, The Status of Hygiene Programs 
in Institutions of Higher Education in the United States (44), which stimulated 
the development of the first set of recommended practices for college health 
centers. With an expanding economy and growth in size and number of 
institutions of higher education, almost 85% of colleges offered some sort of 
student health service by the early 1950s (34). 


COLLEGE HEALTH PRACTICE 


Approximately 1500 institutions of higher education, which enroll 80% of the 
nation’s college students, provide some form of organized student health care 
(39). Student health centers (SHCs) range in size and scope of activity from
small, nurse-directed facilities, which provide limited nursing and health
educational services, to comprehensive health facilities that resemble
multispecialty group practices, some with their own Joint Commission on
Accreditation of Healthcare Organizations-accredited hospitals. Three areas
of emphasis predominate for SHCs: medical, psychological, and health
promotion.


Medical services range from those that address acute problems only to 
full-spectrum care, including the management of chronic disease (15). 
Facilitating access to primary medical care is a central rationale for the 
existence of SHCs. High rates of uninsurance, unfamiliarity with the local 
community resources and/or how to get to them, and lack of understanding 
about whom to see if a medical problem develops are traits common to college 
students. Resource-poor SHCs often give only advice and assistance with 
access to community providers. On large campuses, the predominant model 
of SHC medical service is a primary care setting staffed by physicians, nurse 
practitioners, physician assistants, nurses, medical assistants, and various 
supporting laboratory, pharmacy and radiologic personnel. Immunization 
clinics and family planning clinics are common. Some campuses provide 
dental services, and a few provide optometric care. 

Psychological services are an important part of college health practice. 
They range from a master’s-level counselor, whom a small campus might employ
for crisis intervention and minimal, short-term counseling
duties, to large-scale operations staffed by psychologists, psychiatrists, and
other mental health personnel. Services might include short-term, individual 
patient counseling, extended psychotherapy, crisis intervention, rape and 
sexual assault counseling, initiation and maintenance of psychopharmacologic
agents, group therapy, and facilitation for such groups as Alcoholics
Anonymous and Adult Children of Alcoholics (59). 

Health promotion and health educational services are the third “mainstay” 
of traditional college health practice. Zapka & Love (65) have stated that there 
is no arena in which health educational services play a relatively greater role
than in college health settings. Small SHCs usually dispense health education 
through the nursing staff. In larger SHCs, departments of health education or 
health promotion exist, staffed by master’s- or doctoral-trained health promotion
or health education professionals.

College students visit an SHC an average of two to three times during a 
school year (39). This level of utilization is somewhat lower than the 3.5 
medical visits per year for individuals aged 19-24 noted in the National 
Health Care Expenditures Survey (60). The lower average number of visits 
estimated for SHC utilization may result because students are only on campus 
part of the year, and many have conditions treated electively during the 
summer or other breaks from school. 

Although this paper concentrates on student health issues, it is important to 
recognize that some institutions extend campus health services to serve staff 
and/or faculty and occasionally student, staff, or faculty dependents. This 
becomes important when considering health education and health promotion 
programming. Smoking and alcohol policies, sexual harassment, and injury
control are just a few areas in which comprehensive approaches aimed at the 
entire membership of the campus community are common. 

Student health centers are funded through a combination of fee-for-service, 
identified (prepaid) health fees, insurance reimbursement, and general univer- 
sity support (39). Some SHCs augment these sources through creative 
arrangements with state or local health departments, research dollars, or other 
fund-raising activities. Private colleges are more likely than public institutions 
to require proof of health insurance before entry. This is also true of health 
professions schools. 

Health services, like most other components of universities, exist as a result 
of university policy. These policies are extremely important to the day-to-day
operation of health centers, as they dictate everything from health center 
resource base to hiring policies. Policies and standards, which ultimately 
govern SHC activities, vary in proportion to the heterogeneity of colleges and 
universities themselves. Even in states with centrally managed, multisite 
university systems, such as the California State University or the State 
University of New York, the actual manifestations of uniform student health 
service policies may differ. The reasons for this difference include the prox- 
imity of the campus to other medical or health resources, academic offerings 
of the campus (e.g. nursing or medical schools), local financial and 
programmatic interpretation of central policy, administrative recognition and
support of student health needs, and advocacy on the part of students them- 
selves for health care. 

Since 1964, the American College Health Association has offered recom- 
mended standards for SHCs to use to develop externally valid and consistent 
programs. Revised on a periodic basis, most recently in 1991, these standards 
address clinical, mental health, health promotion, environmental health, and 
support services, as well as ethical and professional issues (2). 


HEALTH PROBLEMS OF COLLEGE STUDENTS 


Only one study in the recent medical or public health literature examines the 
types of problems encountered in student health centers (19), although some 
studies do address issues in specific subpopulations (18, 63). The lack of such 
data is an important public health problem, because it can lead
medical and public health professionals to the conclusion that relatively few, 
and only minor, health needs occur among college students. Lack of informa- 
tion can also lead to poor planning for health services delivery. A wide range 
of acute and chronic health problems, which represents a substantial burden of 
morbidity and mortality, does occur among college students. 

Acute health problems include genitourinary, respiratory, or gastrointestin- 
al infections. Outbreaks of vaccine-preventable diseases, such as measles, 
mumps, and rubella, continue on college campuses (61, 62). Nearly two 
thirds of sexually transmitted disease cases occur among persons under 25 
years of age (13), many among college students. Sexual assault of college 
students is common: One study suggests that one of six female college 
students was victimized by rape or attempted rape within the preceding year
(30). Dermatologic conditions, musculoskeletal problems, and minor trauma, 
including sprains, fractures, and lacerations, are commonly seen in student 
health centers. 

Injuries account for up to half of all deaths for those aged 10 to 24 years 
(53, 64), although with respect to college and university populations these 
statistics can be misleading. As stated earlier, only about 57% of the current 
college population falls into the “typical” 18-24 age range. Also, certain
causes of death, such as homicide, are clearly more common in nonstudent 
groups. 

Some chronic medical problems begin as a new event in the 18-24 age 
group, whereas others carry over from childhood. Seizure disorders, migraine 
headaches, bronchial asthma and other atopic disorders, type I insulin- 
dependent diabetes, arthritis, inflammatory bowel disease, and peptic ulcer 
disease are just a few of the diseases encountered on a regular basis in student 
health facilities. Some cancers occur more frequently in college-age populations.
Acute leukemias, Hodgkin’s disease, testicular neoplasms, and
malignant bone tumors, such as osteogenic sarcomas, are more common in 
adolescents and young adults. More than 50% of all cases of acquired 
immunodeficiency syndrome (AIDS) are diagnosed in persons aged 25 to 39. 
A seroprevalence survey among university students reported one positive 
result per 500 students tested, or 0.2% (21). 

Student health centers serve a growing number of students with serious 
physical and psychological disabilities, such as patients with Down’s syn- 
drome, muscular dystrophy, cerebral palsy, trauma-induced neurologic def- 
icits, and cystic fibrosis. Mental health problems, including stress and 
situational reactions, anxiety and panic disorders, sexual identity and dys- 
functional problems, personality disorders, schizophrenia, and major de- 
pressive disorders, often begin during the college years. 


HEALTH RISK BEHAVIORS 


A series of behavioral, developmental, and environmental issues, which recur 
throughout the above set of health problems and concerns for college stu- 
dents, contribute to premature morbidity and mortality and reduced quality of 
life for college youth. From a public health and preventive medical perspective,
these factors may be enumerated and addressed. Although they may be
considered separately, it is essential to understand their interrelated nature.


Alcohol Use


Alcohol use is the single most important public health problem for college 
students. Alcohol intoxication may be associated with up to 25% of all deaths 
in college-aged students (42). Heavy drinking episodes (five or more drinks) 
are more prevalent among college youth than among their same-age peers (54). Of
injury-related deaths among persons aged 15-24, 75% are caused by motor 
vehicle accidents, and nearly half of all motor vehicle accidents involve 
alcohol (14). Besides motor vehicle accidents, alcohol abuse is closely related 
to other social and health problems of college students. On college campuses, 
alcohol consumption is related to two thirds of all violent behavior, almost 
half of all physical injuries, a third of all emotional difficulties, and 30% of all 
academic problems (25). 


Tobacco and Other Drugs 


Although the rate of daily cigarette use among college students is lower than 
among the general population (13% versus 26%), nearly one in four college 
students smokes at least one cigarette per month (54), which suggests that 
they are experimenting with the substance and are at risk of addiction. Daily 
smoking rates are estimated at 9% for men and 15% for women (54). The 
concurrent use of tobacco and oral contraceptives among many women in this
age group places them at higher risk of developing heart disease and cancer, 
in addition to the other negative health consequences of tobacco consumption. 

College students have an annual prevalence rate for marijuana use equal to 
that of their noncollege same-age peers (35%), and a lower rate of daily marijuana use
(1.8% versus 4.8%, respectively). Although other drug use among college 
students tends to be lower than among their same-age peers, the difference 
varies according to type of drug. The annual prevalence rate for any illicit drug
other than marijuana is 19% for those enrolled in college versus 24% for high 
school graduates in the same age group (54). 


Sexual Behavior 


Reportedly, 78% of adolescent girls and 86% of adolescent boys have en- 
gaged in sexual intercourse by age 20 (52). The relationships of sexual 
behaviors to alcohol and drug use, stress, and developmental and cultural 
issues are a Gordian knot for researchers and practitioners in the field of 
college health. Sexually transmitted diseases, unintended pregnancy, and 
worry over these problems are the daily fare of college health centers. 

An assessment of the prevalence and risk factors for HIV among college 
students suggests that, although the overall prevalence of infection is low and 
confined to high-risk groups, the occurrence of behaviors that facilitate sexual 
transmission of HIV is high (31). Although college students appear to be 
knowledgeable about HIV infection, they have not adequately adopted pre- 
ventive behaviors (28). One survey of college students found that only 25% of 
men and 16% of women always used a condom during sexual intercourse 
(32). However, condom use does appear to have increased minimally among 
college students in recent years (17). 

Unintended pregnancy continues to be a serious, and often life-changing, 
problem among college women, although a review of the recent medical and 
public health literature reveals no reports of pregnancy rates specific to 
college student populations. Cumulative evidence suggests that a substantial 
proportion of sexually active college students do not use contraceptives (17, 
46). Alcohol and drug use has been associated with unprotected/unsafe sexual 
practices. A recent survey of freshmen at 14 US colleges indicated that one of
six students reported engaging in unplanned sexual activity after drinking 
alcoholic beverages (58). 


Suicide and Stress 


Suicide is the third leading cause of death among youth aged 15-24, and the
second leading cause of death among young white men in the same age group. 
Young women attempt suicide unsuccessfully approximately three times more 
often than their male counterparts (52). The causes of suicide are multiple and 
complex; however, substance abuse and severe stress in school or social life
have been linked to suicide among youth (55). The college years represent a 
time of transition from adolescence to adulthood, and from more structured 
environments to independent living situations. Coping with and adapting to this
transition coincides with emotional and often psychologically traumatic ex- 
periences, as well as life-style changes that can have lifetime consequences. 


Nutrition and Physical Activity 


During the college years, adolescents and young adults develop health habits 
that put them at greater risk for the development of many chronic diseases, 
including cardiovascular disease, cancer, and osteoporosis. Dietary habits and 
physical activity are primary risk factor areas subject to change during the 
college years. Stephens et al (43) have suggested that the most dramatic 
reduction in physical activity levels occurs between the ages of 18 and 24. 
There is increasing epidemiologic evidence to support a positive relationship 
between physical activity and physical health, and a similar relationship 
apparently exists between physical activity and mental health (9). 

Diet is linked to heart disease and cancer, yet American eating habits do not 
reflect our current level of knowledge (16). The college years represent a time 
during which there are likely to be unique barriers (e.g. resources, skills, and 
facilities) that limit college students’ ability to maintain healthful eating 
habits. The intense academic and social pressures of campus life may increase 
the risk for development of an eating disorder, such as binge-eating, purging, 
and dieting (45). 


UNIQUE ISSUES FOR THE FIELD OF COLLEGE 
HEALTH 


To complete the picture of college health in this country, we address some 
final issues. 


Nonstandard Age Definitions for Adolescence and Youth 


One of the most important barriers to the development of coherent health 
programs for college-age youth is that of differing definitions of “adoles- 
cence” and “youth.” Without commonly agreed upon standards for these 
terms, it is virtually impossible to collect meaningful morbidity and mortality 
data; develop, compare, and evaluate programs aimed at addressing health 
issues of adolescents and youth; or even create appropriate policies aimed at 
health promotion, disease prevention, and medical care. Age grouping per- 
meates everything in medicine and public health, from medical practice 
arrangements to research agendas to journal publications. Some age group- 
ings for adolescence end at 17 or 18 years (48). Others extend to 24 years. For 
example, the United Nations’ definition of “youth” or “young people” encompasses
the age limits 15 to 24 years (4). Similarly, the World Health Organization’s
definition of adolescence has raised the upper age limit to 24 years,
or about the time of total socioeconomic independence (4). 

Three recent reports on adolescent health have avoided addressing the 
health issues of college-age youth. The Congressional Office of Technology 
Assessment’s April 1991 report on adolescents limited its scope to those aged 
10 through 18 years (48). The American Medical Association acknowledged 
the importance of barriers to health care access faced by those aged 19 to 24, 
but excluded them from its report (20). Finally, preliminary data from the 
National Center for Health Statistics on the health care utilization patterns of 
adolescents covers only those aged 11 to 20 years (35). It is difficult not to 
conjecture that the reason young adults were ignored in these reports is that 
unique data for them is sparse and confusing. In an environment in which 
information on adolescents and young adults is either not collected at all, or 
collected in nonstandard ways, it can easily appear that few problems exist. 


Responsibility, Accountability, and Perceptual Issues 


One of the largest “cracks” in the way our society handles health problems is 
that confronted by adolescents and young adults as they transit from the 
sphere of authority and responsibility of their family-of-origin and move into 
that of their own family and workplace. Who is responsible for the health of 
the 22-year-old emancipated college student with a part-time job in the service 
industry: the student, his/her parents, the college, the student’s employer, the 
community in which the student lives, or some combination of these? 

The “structure” of our health care system does not yield an answer to this 
question. Our discipline-bound perspectives in public health and medicine 
only confound the issue. Organized medicine has overlooked the college 
student population in the past, probably because of the limited economic 
incentives in such a traditionally “healthy” group. This has contributed to the 
rising concern over the competency of health care professionals to meet the 
health needs of young people (5). School health, a traditional area of public 
health practice, is almost always considered to address only those issues 
relevant to preschool through 12th grade students. Public health practice, on 
the other hand, tends to focus upon defined disadvantaged and underserved 
populations in governmental jurisdictions. College students are not included 
when planning these services, even in the face of profound shifts in their 
social and demographic characteristics. 

Students covered by their parents’ insurance policies are usually only 
eligible through age 22 or 23, and many lack insurance (37). A recent survey 
in California found that up to 30% of students had no medical insurance (10). 
Experience on our campus suggests that another 30% have only partial health 
insurance coverage. Temporary status in low-skilled labor positions does not
provide insurance for self-supporting students. Also, even though most 
adolescents and young adults do not incur great expenses for health care 
during any given year, average expenditure data can be misleading. One study 
found that 10% of adolescents with the highest expenses accounted for 65% of 
all out-of-pocket expenses (36). Given that college is now commonly a five- 
to seven-year undertaking, with variable amounts of time “off” to either join 
the temporary workplace or to pursue individual interests, questions of 
responsibility are very complicated indeed. 


DIRECTIONS FOR THE FUTURE OF COLLEGE HEALTH 


Several current, anticipated, and necessary developments are likely to shape 
the future of college health theory and practice. 


Healthy People 2000 


Healthy People 2000: National Health Promotion and Disease Prevention 
Objectives for the Year 2000 specifically addresses college student health as 
follows: “Increase to at least 50% the proportion of postsecondary institutions 
with institution-wide health promotion programs for students, faculty, and 
staff” (52). Postsecondary institutions, including two- and four-year community
colleges, private colleges, universities, and trade and technical schools,
have been identified as settings in which many 18- to 24-year-olds can be 
reached. Currently, there are no reliable national estimates of the proportion 
of postsecondary schools that offer institution-wide health promotion pro- 
grams. A survey of 3000 postsecondary institutions conducted by the Amer- 
ican College Health Association in 1989-1990 suggests that at least 20% of 
the institutions surveyed offered health promotion activities for students (1). It 
is encouraging that Healthy People 2000 recognizes young people as a special 
population that, in many cases, experiences higher rates of morbidity, disabil- 
ity, and mortality than the general population (3). 


Comprehensive College Health and the Integration of School 
Health, College Health, Worksite Health, and Public Health 
Promotion 


A comprehensive approach to college health requires the integration of pro- 
grams and services similar to that which is now advocated for school health. 
College communities share many characteristics with K-12 schools. Tradi- 
tional school health, including only health instruction and clinical health 
services, is expanding to incorporate five additional areas: integrated school 
and community health promotion efforts, physical education, food service, 
counseling, and health promotion programs for faculty and staff (29). College 
health practice is likely to expand similarly. A framework for the development
of campus-based health programs would include environmental, biomedical,
behavioral, and organizational interventions (23). However, as we
noted earlier, the unique, independent, and often balkanized nature of college 
campuses will make such logical and coherent approaches difficult. 

The first step should be measurable success in community health pro- 
motion—combined educational, social, and environmental actions aimed at a 
population in a geographically defined area (23). In college communities, 
these actions may be directed at high-risk students, special interest groups, 
faculty and staff, and/or the entire campus community. Models of com- 
prehensive college health must be developed and tested. Because the college 
community plays an essential role in the day-to-day lives of students and 
because it is oftentimes the only stable environment to which a college student 
relates, the college community is uniquely situated to accomplish this. 

To complement the development and implementation of comprehensive 
college health practice, it is essential to coordinate and articulate college 
health promotion, disease prevention, and medical service activities with 
similar activities in schools, worksites, and the public health sector. Out- 
comes desired from each enterprise are the same. We are a long way, 
however, from a common vision, a unifying theory, or clear policies that
provide for meaningful working relationships among all of these sectors. 


Increased Recognition and Understanding of College Student 
Health Issues 


The most critical step in attaining appropriate recognition for college student 
health needs is the development of a common language for data relevant to 
adolescents and young adults. This must be a joint undertaking of representa- 
tives from the college health community; representatives from adolescent 
health, public health, pediatrics, internal medicine, family practice, school 
health, psychiatry, psychology, nursing, other health care sciences; and 
representatives from governmental health, education, and welfare agencies. A 
set of agreed-upon age groupings, definitions, and terms must be developed 
so that uniform data on morbidity, mortality, health, social, and economic 
status of adolescents and young adults can be collected, aggregated, and 
reported. 

The development of a common language will facilitate research into the 
determinants of health and illness in college students, including the creation 
and maintenance of surveillance systems aimed at tracking important health 
risk behaviors. This should become standard practice for public health pro- 
fessionals. 

Finally, research into the relationships between health and academic per- 
formance is needed. In one study of a university that had an 8.5% overall 
attrition rate, and a 25% loss of freshmen after the first year, health-related
problems were found to be a leading cause of school drop-out (12). At the 
University of California, Berkeley, over 25% of the students who withdraw 
list health as a reason for doing so (56). These reports notwithstanding, there 
is a dearth of quality research on the relationships between health status, 
academic performance, undergraduate or graduate education completion 
rates, and ultimate career success. 


CONCLUSION 


Public health professionals should become familiar with the unique health 
problems of college students and the potential that college communities have 
as environments for health promotion and disease prevention. In addition, the 
question of responsibility for the health of college students must be addressed, 
as they are among the most likely groups to be uninsured. President Nils 
Hasselmo of the University of Minnesota has proposed a seventh principle to 
be added to six principles for campus life, which were recently published by
the Carnegie Foundation for the Advancement of Teaching (11): “A college or 
university is a healthy community, one in which personal and public health is 
an accepted institutional commitment, backed by policies and programs that 
apply the knowledge we have acquired” (24). This statement is an extremely
productive starting point. However, continued dialogue must occur among
representatives of higher education, public health, the medical community, 
local, state, and federal government, and other sectors of society with a stake 
in the health of youth.


INTRODUCTION


The incidence rates and patterns of certain diseases and conditions in Amer- 
ican Indians and Alaska Natives often prove useful in understanding the 
nature of these conditions (5, 37, 42, 45, 52). This may also be true for infant 
mortality. In this review, we examine infant mortality data for American 
Indians and Alaska Natives and compare these data with those for the general 
US population. In some instances, comparisons between Indian groups are 
made. Because excellent general discussions of infant mortality are available 
(7, 28, 29, 53), in most instances we cite only references to American Indians 
and Alaska Natives. For convenience, we use the term “Indian” to denote 
those persons commonly identified as American Indians and Alaska Natives. 
The Indian Health Service (IHS), its relationship with the various tribes, the 
sovereign nature of the tribes, and the changing health conditions of Indians 
have recently been described (16, 38-40). 


DIFFICULTIES IN CALCULATING INDIAN DEATH 
AND DISEASE RATES 


Confounding factors that accompany enumerations of vital events relating to 
Indians dictate caution in drawing conclusions. For example, the heterogene- 
ity and complexity of tribal groups and organizations, and the small size of 
many tribes, make accurate data collection difficult and expensive and often 
limit general conclusions. On the other hand, many differences are suf- 
ficiently large that errors of sampling are less important. For example, death 
rates for certain conditions are severalfold greater for Indians than for the 
general population (18); thus, conclusions can be made with sufficient confi- 
dence to permit program planning and design. When intergroup differences 
are not so great, however, considerable caution is advised and special care is 
required to avoid erroneous conclusions. To compensate for the small num- 
bers in many communities, the IHS averages vital event rates for three years 
centered on the year being studied. Other problems arise from identification 
and reporting variations, which in some instances are quite large. Trends 
in Indian Health (18), an annual publication, presents the sources of IHS 
vital event and health data and discusses the care to be taken in their inter- 
pretation. 
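
The three-year centering described above can be sketched as follows. This is a minimal illustration, not the IHS procedure: the text states only that rates are averaged over three years centered on the study year, so the choice to take the mean of three annual rates (rather than, say, pooling the counts) is an assumption, and all counts are hypothetical.

```python
# Illustrative sketch only: every count below is hypothetical, not IHS data.

def annual_rate(deaths, births):
    """Infant deaths per 1000 live births for a single year."""
    return 1000.0 * deaths / births


def centered_three_year_average(deaths_by_year, births_by_year, year):
    """Mean of the annual rates for year-1, year, and year+1 (assumed method)."""
    window = (year - 1, year, year + 1)
    rates = [annual_rate(deaths_by_year[y], births_by_year[y]) for y in window]
    return sum(rates) / len(rates)


# A small hypothetical community in which one unusual year dominates.
deaths_by_year = {1986: 3, 1987: 7, 1988: 2}
births_by_year = {1986: 410, 1987: 395, 1988: 405}

single_year_1987 = annual_rate(deaths_by_year[1987], births_by_year[1987])
averaged_1987 = centered_three_year_average(deaths_by_year, births_by_year, 1987)

print(f"1987 single-year rate:   {single_year_1987:.1f} per 1000")
print(f"1987 three-year average: {averaged_1987:.1f} per 1000")
```

With these invented counts, the single-year rate is about 17.7/1000, whereas the centered average is about 10.0/1000, which shows how averaging dampens year-to-year fluctuation in small populations.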

To establish baseline denominator numbers, estimates of the total US 
Indian population are made during the decennial censuses, which enumerate 
individuals who identify their race as American Indian, Eskimo, or Aleut. 
From these data, and from annual birth and death counts reported on state 
vital records, the IHS projects the number of Indians who are eligible for IHS 
services. For this purpose, the IHS utilizes the number of self-identified 
Indians who reside in counties “on or near” federal Indian reservations in the 
33 “reservation states” in which the IHS has health care responsibilities. 
These counties make up the IHS service area and contain the Indian popula- 
tion thought to most nearly approximate the population that utilizes IHS 
services. This IHS service area population differs from both statewide 
(reservation states) and national Indian populations, and failure to distinguish 
between these populations can lead to confusion. The number of both counties 
and reservation states continually changes as Indian tribes receive federal 
recognition and as tribes add members to their rolls. As a result of such 
changes, Connecticut, Rhode Island, and Texas in 1983, and Alabama in 
1984, were designated as reservation states. Between 1980 and 1990, the IHS 
added 58 counties to the IHS service area. Of the 3150 counties and in- 
dependent cities in the United States, 505, or 16%, now make up the IHS 
service area. As of October 1989, there were an estimated 1,780,000 Indians 
residing in the United States. Of these, 1,642,000, or 92%, resided in the 
reservation states and 1,105,000, or 62.1%, resided in the IHS service area. 
As a result, there are more than 500,000 Indians who may be included in 
tabulations of state or national data, but are not included in analyses of those 
living in the IHS service area. Also, estimates include all persons reported as 
American Indian and Alaska Native, not just those eligible for IHS services 
by virtue of belonging to tribes that have federal recognition. Only statewide 
vital event data are available before 1972, so the IHS utilizes this data base to 
project trends back to 1955. 
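
The population shares quoted above can be rechecked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are ours, and the two "outside the service area" differences are shown because the text does not say which of them underlies the "more than 500,000" statement.

```python
# Figures as quoted in the text (October 1989 estimates and county counts).
total_us_indians = 1_780_000
in_reservation_states = 1_642_000
in_ihs_service_area = 1_105_000
us_counties = 3150
service_area_counties = 505

share_reservation_states = 100.0 * in_reservation_states / total_us_indians
share_service_area = 100.0 * in_ihs_service_area / total_us_indians
share_counties = 100.0 * service_area_counties / us_counties

# Persons counted in state or national tabulations but not in the service area.
outside_within_states = in_reservation_states - in_ihs_service_area
outside_nationally = total_us_indians - in_ihs_service_area

print(f"Reservation states: {share_reservation_states:.1f}% of US Indians")
print(f"IHS service area:   {share_service_area:.1f}% of US Indians")
print(f"Service area counties: {share_counties:.0f}% of US counties")
print(f"Outside service area, within reservation states: {outside_within_states:,}")
print(f"Outside service area, nationally: {outside_nationally:,}")
```

Either difference (537,000 within the reservation states, or 675,000 nationally) is consistent with the statement that more than 500,000 Indians fall outside service area analyses.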

The accuracy of the decennial census in enumerating certain groups has 
often been questioned. Passel (33) found that the 1970 census estimates of the 
number of American Indians under 20 years of age could have been as much 
as 6.9% too low. However, for all ages there was a net “overcount” by as 
much as 67,000 between the 1960 and 1970 censuses. Some of this variation 
arose from differences in racial self-identification, which suggests that many 
Indians were reported as white on their birth certificates and on the 1960 
census, but their designation was changed to Indian during the 1960s. In 
addition, racial designation was completed by enumerators in most rural areas 
of the US during the 1960 census, whereas self-identification was used to 
report race during the 1970 census. Recent trends suggest that the number of 
persons identifying themselves as Indian is increasing, and the 1990 census is 
expected to add a substantial number of newly identified Indians. This will 
increase the denominator, thus adding another confounding factor to an 
already difficult field of study. 

For most routine vital event analyses, the IHS utilizes data for Indians 
compiled by the National Center for Health Statistics (NCHS) from state birth 
and death records (18). Norris & Shipley (31), who examined California data, 
found considerable variation in the identification of race between the birth and 
death certificates of Indian infants. Of 148 Indian infants identified by race of 
either parent on birth records, 90, or 60.8%, were coded as white on the death 
certificate. Use of the birth cohort method caused an increase in the calculated 
rate for Indians from 13.9/1000 to 29/1000. Staub and coworkers (46), who 
examined data for the Cattaraugus Reservation in New York, found that 
Indian children were often registered at birth as Caucasian and that the 
recorded number of Indian births for the reservation might actually be falsely 
low, which possibly caused a falsely high mortality rate. Examination of 
every tenth state birth record of a group of 84 Cattaraugus children disclosed 
that of the nine records studied, five newborns were recorded as American 
Indian and two as white; no record could be found for the other two. Similar 
results were reported for Indians in Washington State by Frost & Shy (6) and 
in Oklahoma by Kennedy & Deapen (20). 

In summary, Indians are a heterogeneous population that is organized into 
small groups; live in arbitrarily defined geopolitical areas; and comprise 
several subsets of populations that identify themselves, or are variously 
identified, as Indian. The lack of congruence between the several population 
subsets, including those receiving services from the IHS, is not surprising. 
These reporting and identification variations influence the size of both the 
numerator and the denominator, with obvious effects on calculated vital event 
rates. 
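
The direction of these effects can be illustrated with a short sketch. The counts and misclassification fractions below are hypothetical and are not taken from any of the studies cited; the function name and parameters are ours.

```python
# Illustrative sketch only: counts and misclassification fractions are hypothetical.

def observed_rate(true_deaths, true_births, deaths_miscoded=0.0, births_miscoded=0.0):
    """Infant deaths per 1000 live births as they would appear in the records.

    deaths_miscoded: fraction of Indian infant deaths coded as another race
                     on the death certificate (shrinks the numerator).
    births_miscoded: fraction of Indian births registered as another race
                     (shrinks the denominator).
    """
    counted_deaths = true_deaths * (1.0 - deaths_miscoded)
    counted_births = true_births * (1.0 - births_miscoded)
    return 1000.0 * counted_deaths / counted_births


true_rate = observed_rate(30, 2000)                             # no misclassification
numerator_loss = observed_rate(30, 2000, deaths_miscoded=0.3)   # deaths undercounted
denominator_loss = observed_rate(30, 2000, births_miscoded=0.2) # births undercounted

print(f"True rate:                 {true_rate:.2f} per 1000")
print(f"30% of deaths miscoded:    {numerator_loss:.2f} per 1000")
print(f"20% of births miscoded:    {denominator_loss:.2f} per 1000")
```

Undercounting deaths lowers the calculated rate, whereas undercounting births raises it; when both occur, the net bias depends on which loss is larger.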


EFFORTS TO IMPROVE CALCULATIONS 
OF INFANT MORTALITY 


To provide more detailed characteristics of infant deaths and more accurate 
estimates of infant mortality rates among ethnic minorities, the NCHS began 
to match birth and infant death files, beginning with the national 1983 birth 
cohort. Kleinman (21), who used 1983 and 1984 cohort data, found that 
differences in classification of race on birth and death certificates were less 
than 2% for whites and blacks, but 25-40% for Indians and Asians. By 
calculating infant mortality rates according to race of mother, he found that, 
throughout the US, Indians had a mortality rate of 14.3/1000 and a relatively 
low neonatal mortality rate of 6.9/1000, but they had the highest postneonatal 
mortality rate (7.4/1000) of all racial groups studied. Handler & Macken (9) 
reviewed the race of parents on reservation state birth certificates of newborns 
who had at least one parent reported as Indian for calendar years 1972-1987. 
They calculated the 1987 Indian infant mortality rate for reservation states to 
be 12.3/1000 if only the mother’s race were used to identify Indian births, 
compared with 9.8/1000 if both parents’ race were used. Because the latter 
cohort group most nearly resembles the IHS service area population, the IHS 
proposes to use linked data, when available, thus considering the identifica- 
tion of race of both parents for most program purposes. In the future, the IHS 
intends to publish Indian infant mortality data by using two criteria for 
identifying Indian births: race of the mother only (the NCHS criterion) and race 
of either parent reported as Indian. 


INDIAN INFANT MORTALITY 


Table 1 shows the aggregate infant mortality rates for Indians within the IHS 
service area for 1976 to 1987 from unlinked death records. Each year’s rates 
represent a three-year average centered on the year shown. In 1976, the Indian 
infant mortality rate was 20.1/1000 live births. By 1987, it had fallen to 
11.1/1000, a decrease of 45%. During this interval, the infant mortality rate 
for US all races declined from 15.2/1000 to 10.1/1000, a decrease of 34%. 
The Indian neonatal death rate fell from 9.8/1000 to 5.1/1000, a decrease of 
48%; the postneonatal death rate fell from 10.3/1000 to 6.0/1000, a decrease 
of 42%. 
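
The percentage declines quoted above follow directly from the start and end rates. A minimal sketch, using only the rates given in the text (the dictionary labels are ours):

```python
def percent_decline(start_rate, end_rate):
    """Relative decline from start_rate to end_rate, as a percentage."""
    return 100.0 * (start_rate - end_rate) / start_rate


# (1976 rate, 1987 rate) per 1000 live births, as quoted in the text.
rates = {
    "Indian infant mortality": (20.1, 11.1),
    "US all races infant mortality": (15.2, 10.1),
    "Indian neonatal mortality": (9.8, 5.1),
    "Indian postneonatal mortality": (10.3, 6.0),
}

declines = {label: percent_decline(start, end) for label, (start, end) in rates.items()}

for label, decline in declines.items():
    print(f"{label}: {decline:.1f}% decline")
```

Rounded to whole numbers, the results reproduce the 45%, 34%, 48%, and 42% figures in the text. The neonatal and postneonatal rates also sum to the overall infant rate in both years (9.8 + 10.3 = 20.1 and 5.1 + 6.0 = 11.1).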

Figure 1 illustrates the leading causes of Indian infant deaths, compared 
with US all races. The first six leading causes of Indian infant deaths under 
one year of age are the same as those experienced by US all races, although in 
a slightly different order of ranking. 


SOURCE: Division of Program Statistics, OPEL, IHS 


Figure 1 Leading causes of Indian infant deaths under one year of age compared with US all 
races, 1987 (unlinked data from “Reservation States”). 


The leading causes of Indian neonatal deaths are congenital anomalies, 
disorders relating to short gestation and low birthweight, respiratory distress 
syndrome, sudden infant death syndrome (SIDS), effects of maternal com- 
plications of pregnancy, and infections specific to the perinatal period (Figure 
2). Although there are differences between Indians and US all races in the 
rates of each cause, the rank order is the same, except for the higher ranking 
of SIDS for Indians. The distribution of leading causes of postneonatal deaths 
is quite different between the Indian and the US all races population, as shown 
in Figure 3. The significance of SIDS for both Indians and the US all races is 
striking, followed by congenital anomalies, accidents and adverse effects, 
pneumonia and influenza, meningitis, and septicemia. The conditions in 
which there is an “excess” of Indian deaths compared with the US all races are 
SIDS, accidents and adverse effects, pneumonia and influenza, and meningi- 
tis. In 1987, SIDS was responsible for more than 39% of all postneonatal 
Indian infant deaths, and the Indian rate was 1.5 times that for US all 
races. 


INFANT MORTALITY RATES BY IHS AREA 


Not surprisingly, considerable variation in infant mortality rates exists be- 
tween IHS areas (17), from a low of 8.3/1000 in the Albuquerque area to a 
high of 19.8/1000 in Aberdeen (Table 2). In California and Oklahoma, where 
reporting differences are known to be large, the rates are considered unreli- 
ably low. The differences between the low rates in the Southwest and the 
rates in the Central Plains and Northwest are striking. For example, the 1987 
average infant mortality rate of the Albuquerque, Phoenix, Tucson, and 
Navajo areas was 11.0/1000, compared with an average of 17.2/1000 for 
Billings and Aberdeen. The mortality rate of the latter two areas even exceeds 
that of 14.7/1000 for the Alaska area. 


Figure 2 Leading causes of Indian deaths under 28 days compared with US all races, 1987 
(unlinked data from “Reservation States”). 

Similar differences exist in neonatal death rates: The highest is 7.9/1000 in 
the Aberdeen area, compared with 4.6/1000 in the Navajo area. In 1987, the 
average postneonatal death rate for the Navajo, Albuquerque, Tucson, and 
Phoenix areas was 5.3/1000, compared with 10.5/1000 for Aberdeen and 
Billings. Thus, the excess mortality experienced in the Northern Plains is 
especially marked in the postneonatal period, and the single most important 
cause is SIDS. Again, neonatal and postneonatal data for California and 
Oklahoma are considered unreliable. 

The infant mortality experience for Navajos has not always been favorable. 
In 1970, Brenner and coworkers (3) reported that the infant mortality rate for 
Navajos born on the reservation was 31.5/1000. Almost one-half occurred in 
the neonatal period. These authors found that hospital records were superior 
to death certificates for identifying the cause of death. They state that “there 
were numerous instances in which the primary cause of death listed on the 
death certificate was grossly inconsistent with hospital chart observations.” In 
1969, Van Duzen and coworkers (49) reported that 616 children were suffer- 
ing from malnutrition at the Tuba City, Arizona, facility. Of these children, 
15 were diagnosed as having kwashiorkor, and 29 had marasmus. The authors 
concluded that malnutrition contributed to the mortality of these Navajo 
children. 


Figure 3 Leading causes of Indian deaths 28 days to under one year compared with US all 
races, 1987 (unlinked data from “Reservation States”). 

In recent years, the development of a strong maternal and child health 
(MCH) program has greatly increased the accuracy of reporting Navajo infant 
deaths. Ross (41) found that both prenatal and postpartum visits increased 
after the 1969 institution of a nurse midwife program. This program seemed 
to be associated with a decrease in length of hospitalization and improved 
infant mortality rates. 


FACTORS CONTRIBUTING TO INDIAN INFANT 
MORTALITY 


Historical Review 


In 1970, Hill & Spector (11) found that about 8% of liveborn Indian infants 
weighed 2500 g or less, compared with 7% for whites and 14% for nonwhites. 
The 1967 infant mortality rate was 32.2/1000 for Indians, 19.7 for whites, 
and 35.9 for nonwhites. The authors reported that the infant mortality of 
Indians had declined nearly 50% since 1955, compared with a decline of 16% 
for both whites and nonwhites. Neonatal death rates of Indians were about the 
same as for whites, but Indian postneonatal death rates were four times higher 
than those for whites. Wallace (51) reported that the 1967 Indian infant 
mortality rate was 34.5% higher, and the postneonatal rate three times 
higher, than that of the US general population. By using data from the 
National Infant Mortality Surveillance, Vanlandingham and coworkers (50) 
compared relative risks of mortality between Native Americans and white 
infants in Alaska, Arizona, Montana, New Mexico, North Dakota, and South 
Dakota, where the major proportion of nonwhite, nonblack infants were 
Native American. The infant mortality rate among Native Americans was 
15.3/1000 live births, compared with 8.7/1000 among whites. These authors 
confirm the variations that exist in different surveys and reporting systems. 
They also confirm the importance of infections and SIDS in postneonatal 
deaths of Indians. Hirschhorn & Spivey (12), who examined data from 1965 
through 1971, found that the infant mortality rate of White Mountain Apaches 
was 76/1000 live births, compared with 32/1000 for all Indians and 22/1000 
for US all races. The proportion of low birthweight for Apaches was approx- 
imately 10%, compared with 8% for the general population. Given the low 
socioeconomic conditions prevailing in many Indian communities, one might 
expect a higher infant mortality rate than has been found by either the IHS or 
the NCHS. In 1983, Sullivan & Beeman (47) reported that Indians in Arizona 
experienced less prenatal care, higher incidence of newborn problems, a 
higher rate of communication problems with providers, and considerably less 
satisfaction with the care received. Honigfeld & Kaplan (14) pointed out that 
several factors contribute to the excess Indian postneonatal deaths, and es- 
sentially all of them relate to the low socioeconomic conditions in which 
many Indians live. The authors recommended that programs to lower post- 
neonatal mortality should focus on promoting prompt health care and prevent- 
ing accidents and other postneonatal health problems. However, Nutting et al 
(32), who studied the effect of the establishment of a MCH program on the 
Papago reservation in Arizona, found that the greatest effect of the program 
was on the group “who sought and received reasonably good care prior to the 
program” and that the group most at risk did not derive the expected benefits 
of the program. 


Low Birthweight and Prematurity 


Comparisons of low birthweight between Indians and US all races are shown 
in Figure 4. Of births to Indians under the age of 15, 7.4% are low birth- 
weight, compared with nearly 14% of births to the same age group for US all 
races. This predisposition to larger babies continues until age 25-29, during 
which the occurrence of low birthweight babies is almost the same for both 
groups. The ratios are reversed after age 30-34, when US all races mothers 
slightly exceed Indian women in producing “normal” weight newborns. This 
difference is offset by the greater number of Indian births in mothers under 
age 24. Of Indian births in reservation states in 1987, 19% occurred among 
women under age 20, compared with only 12.4% of the US all races births in 
this age group (18). This difference was not present in the linked birth/death 
calculations of Kleinman (21), who found that Indians had a low birthweight 
rate of 62/1000 live births, compared with 56/1000 for whites and 127/1000 
for blacks. 


Figure 4 Incidence of low birthweight (<2500 g) Indian newborns as a percent of total live 
births, by age of mother, compared with US all races, 1987 (unlinked data from “Reservation 
States”). 

Iba et al (15) confirmed the importance of genetic and environmental 
factors in low birthweight babies; when these factors were considered, lack of 
prenatal care was significantly associated with low birthweights and newborn 
deaths. Adams and coworkers (1) identified two different populations of low 
birthweight Indian infants: those who were part of a “normal” distribution and 
those who belonged to a group that they termed “deviant” low birthweight. 
They pointed out sharp tribal differences in the makeup of each group. 


Congenital Anomalies 


Indian Health Service data suggest that congenital anomalies play a slightly 
smaller role in Indian infant mortality than in US all races infant mortality. 
However, Lynberg & Khoury (26), who used linked birth/infant death data, 
found that infant deaths associated with birth defects were highest for Amer- 
ican Indians (2.9/1000), compared with Asians and Hispanics (2.6/1000) and 
blacks (2.5/1000). American Indians also had the highest incidence rate of 
major defects: 22.0/1000 compared with 18.0/1000 for blacks and 19.0/1000 
for whites. Lowry and coworkers (25), on the other hand, found a lower 
frequency of congenital defects in British Columbia Indians than in the 
general population. Niswander and coworkers (30) reported that Indians 
experienced more cleft lip and palate and polydactyly and a lower frequency 
of clubfoot and central nervous system malformations compared with Cauca- 
sians. They also found considerable variation in the rates of congenital 
malformations between Indian groups. 

Although data are incomplete, there is a strong suggestion that fetal alcohol 
syndrome (FAS) contributes to infant mortality (24, 48). This syndrome, 
which is defined as a pattern of mental retardation, facial deformities, and 
growth failure that occurs in fetuses and infants who were exposed to alcohol 
in utero, results in a higher than expected number of fetal deaths during 
pregnancies of alcohol-abusing women (35). Estimates of the incidence of 
FAS in the general population vary greatly, ranging from one to three cases 
per 1000 live births (35). Among Indians of the Southwest, the incidence is 
reported to vary from 1.3/1000 to 10.3/1000 live births (27). Fetal alcohol 
syndrome is an important issue because it is currently the most common 
preventable cause of congenital anomalies. 


Sudden Infant Death Syndrome 


The cause of SIDS, a prominent contributor to infant mortality, especially in the 
postneonatal period, is not known, and many aspects are incompletely un- 
derstood (43). Among Indians, SIDS causes 40% of postneonatal deaths and 
may currently be considered the most significant cause of Indian infant 
mortality. In Kleinman’s (21) linked studies, the SIDS rate for Indians was 
nearly three times higher than for whites. SIDS is also differentially distrib- 
uted among Indian populations, as it is especially prevalent in the northern 
parts of the US. Recent suggestions of a possible relationship to tobacco 
smoking (2, 4, 8, 13, 22) are of special interest in view of the known high 
rates of smoking in the same IHS areas where SIDS is most prevalent (23, 34, 
44). Shannon & Kelly (43) describe several epidemiologic factors that in- 
crease the risk of SIDS: “If the mother is less than 20 years of age, unmarried, 
poor, if she has delayed or failed to seek prenatal care, had a short interval 
between pregnancies, been ill during pregnancy, or had previous fetal loss, or 
if she has smoked cigarettes, or abused narcotics. The risk is also increased if 
the father is less than 20 years old or of low social or economic level. . . . In 
the United States, where the overall risk is about 2.0 cases per thousand, 
Asians have the lowest risk, and American Indians, Alaska Natives, and poor 
Blacks have the highest.” Kraus and coworkers (23) found that variables with 
a high degree of discriminatory power for sudden unexplained death were age 
of the mother, total number of children born alive, birthweight, multiple 
birth, duration of prenatal care, and sex of the infant. For Indians, the most 
important discriminating variable was birthweight, but for whites it was the 
age of the mother. Kaplan and coworkers (19) were unable to confirm a 
significantly greater incidence of SIDS among Oklahoma Indians compared 
with the non-Indian population. 


Injuries 


In 1987, accidents constituted the fourth leading cause of death among 
Indians under one year of age in the reservation states. The death rate 
(0.4/1000) is twice that for US all races infants, with differences especially 
noticeable in the postneonatal period. Indian male infants are more likely to 
die of injuries than are Indian females, a situation somewhat at variance with 
that for US all races infants, among whom gender differences are less 
striking. The mortality rate of Indian male infants associated with motor 
vehicle injuries is more than four times greater than that for US all races male 
infants (22.6/100,000 compared with 5/100,000) (18). 


Homicide 


In 1987, the homicide rate for those less than one year of age was 12.2/ 
100,000, with males affected approximately three times more often than 
females (18.8/100,000 compared with 5.6/100,000, respectively). Compara- 
ble rates for US all races (both sexes) were 7.4/100,000 (males, 7.9; females, 
6.4/100,000). The rate of homicides among male Indian infants is more than 
two times greater than that for US all races males (18). The estimates of 
Indian infant death rates from injuries and homicide would undoubtedly be 
higher if linked birth/death records were employed. 
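
The comparative statements in the two preceding sections are simple rate ratios. The sketch below recomputes them from the rates quoted in the text; the helper function and variable names are ours.

```python
def rate_ratio(rate_a, rate_b):
    """How many times larger rate_a is than rate_b (same units assumed)."""
    return rate_a / rate_b


# Rates per 100,000 infants under one year of age, 1987, as quoted in the text.
indian_male_motor_vehicle = 22.6
us_male_motor_vehicle = 5.0
indian_male_homicide = 18.8
indian_female_homicide = 5.6
us_male_homicide = 7.9

motor_vehicle_ratio = rate_ratio(indian_male_motor_vehicle, us_male_motor_vehicle)
homicide_sex_ratio = rate_ratio(indian_male_homicide, indian_female_homicide)
homicide_male_ratio = rate_ratio(indian_male_homicide, us_male_homicide)

print(f"Motor vehicle deaths, Indian vs US males: {motor_vehicle_ratio:.1f}x")
print(f"Homicide, Indian males vs Indian females: {homicide_sex_ratio:.1f}x")
print(f"Homicide, Indian males vs US males:       {homicide_male_ratio:.1f}x")
```

The ratios (about 4.5, 3.4, and 2.4) agree with the descriptions "more than four times," "approximately three times," and "more than two times."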


IHS MATERNAL AND CHILD HEALTH PROGRAM 


From its inception in 1955, the IHS has emphasized attention to maternal and 
child health as the most efficient way to deal with general Indian health. The 
specific goal is to provide a wide range of health promotion services that 
relate to childbearing and to the female reproductive cycle, including family 
planning, in comprehensive, community-oriented programs carried out 
through both the general program and certain special MCH activities. Pro- 
grams include well-child surveillance, special programs for the de- 
velopmentally disabled, and services for chronically and acutely ill patients. 
Emphasis is placed on early prenatal and postpartum care. Health education 
and prevention activities, such as immunizations, are stressed, and immuniza- 
tion levels of Indian children regularly exceed those for the general popula- 
tion. 

In 1991, the IHS will expend an estimated $218,000,000 for all services 
related to maternal and child care, including prevention of infant mortality. 
This is approximately 17% of the $1,275,000,000 appropriated for IHS 
clinical services and includes $2,100,000 for special projects aimed at defin- 
ing the problem of FAS and implementing programs to decrease its incidence. 
Another special program is the utilization of bacterial polysaccharide immune 
globulin combined with vaccine to control the unusually high prevalence of 
invasive Hemophilus influenzae type b infections in certain Alaskan and 
southwestern United States communities. 

In December 1989, in response to amendments to PL 94-437, the Indian 
Health Care Improvement Act (36), the IHS developed a plan to further 
reduce infant mortality (G. Brenneman and J. Lyle 1989, unpublished data). 
This multidisciplinary plan is being updated and will focus on the objectives 
in Healthy People 2000 (10). The plan places primary responsibility in each 
IHS area office to strengthen existing MCH programs, cooperate with the 
Bureau of Indian Affairs in efforts to deal with FAS, and reduce the number 
of teen pregnancies. The beneficial collaboration with the Indian Health 
Committees of the American Academy of Pediatrics and the American Col- 
lege of Obstetricians and Gynecologists will be continued. Each IHS area office 
will establish multidisciplinary and interagency infant and maternal mortality 
review teams. Information and recommendations from the American 
Academy of Pediatrics Postneonatal Infant Mortality Project will be in- 
corporated into the program as appropriate. Because of both the importance of 
injuries as a cause of death among Indians and advances in injury prevention, 
the IHS formalized a Community Injury Prevention program in the early 
1980s, which is located in the IHS Office of Environmental Health and 
Engineering. A full-time coordinator was appointed with authority to oversee 
the development of multidisciplinary approaches to decrease the number of 
injuries among American Indians. Several elements of this program have 
already been successful. In 1990, approximately 30,000 students participated 
in an annual safety poster contest. In 1987, a unique training fellowship, in 
conjunction with Yale University, was established to provide training for 
selected field employees. This training includes special epidemiologic pro- 
jects and the design of prevention programs, which are then implemented by 
the fellows. A memorandum of agreement with the Centers for Disease 
Control provides funding for competitive grants to support innovative com- 
munity injury control programs. An active education program is coupled with 
distribution of car restraints for infants in virtually all service units. 

The improvement in infant mortality since 1955 represents a major achieve- 
ment and remains impressive, regardless of the techniques used to calculate 
infant mortality (9, 18). Unfortunately, the previously steady downward trend 
in infant mortality slowed by 1981 and, by 1983, had leveled off. An 
important unanswered question is whether there is a difference in the infant 
mortality rates of Indians who receive care in the IHS compared with those 
who do not. Such a comparison would be extremely difficult and costly with 
present data systems. 


SUMMARY 


Accurate determination of infant mortality rates among Indians is seriously 
hampered by variations in the identification of Indian persons and use of 
different subsets of the Indian population for various purposes. Lack of 
consistency in the reporting of racial origin on birth and death records is a 
source of substantial error. Because of these factors, more than the usual care 
must attend comparisons and inferences drawn from data in which these 
differences are present. At present, it would seem prudent to regard all data 
about American Indians as provisional. Even though Indian infant mortality 
remains higher than that for US all races, regardless of techniques used for 
estimates, the decline of Indian infant mortality by more than 80% since the 
establishment of the IHS is a truly remarkable achievement. This success has 
been ascribed to a combination of activities, including the provision of safe 
drinking water, especially as an integral part of the IHS program; the nearly 
universal immunization of Indian children; and emphasis upon com- 
prehensive, community-oriented programs focused on maternal and child 
care. These successes have contributed to changes in the distribution of the 
leading causes of Indian infant mortality, so that the most prominent causes 
now are SIDS, congenital anomalies, injuries, and various infections. Be- 
cause of these changes and advances in knowledge, the IHS has recently 
revised its five-year plan for dealing with infant mortality to provide greater 
attention to injuries and infections and has embarked upon a series of dis- 
cussions with the American Academy of Pediatrics to address postneonatal 
deaths and the difficult problem of SIDS. Low socioeconomic conditions, so 
important in influencing mortality rates (7, 14, 29), have thus far proved to be 
intractable. In the meantime, success will depend upon ensuring optimal 
prenatal care, reducing those risk factors amenable to correction, and solving 
the problem of SIDS. 


INTRODUCTION 


The public health practice of tobacco control in the United States has evolved 
considerably since the publication of the US Surgeon General’s 1964 advisory 
committee report on the health consequences of smoking (84). The report 
initially provided the scientific information needed to launch an effective, 
sustained national public health campaign against tobacco (76). As a result of 
this campaign and other healthy lifestyle changes among Americans, the 
prevalence of smoking among adults has declined from 40.4% in 1965 to 
29.1% in 1987 (32). Mortality rates for chronic diseases, such as ischemic 
heart disease, lung cancer, and chronic obstructive lung disease, were greatly 
affected by the increase in tobacco consumption from World War I to the 
mid-1960s. Mortality rates for cardiovascular disease have 
declined since the 1960s, and lung cancer mortality rates among men have 
declined since 1985 (2, 66). The antitobacco campaign has successfully 
reduced tobacco use since 1964. Because of this reduction, an estimated 
789,000 deaths due to tobacco use were avoided or postponed during the 
period 1964-1985 (84). Further progress against tobacco-related diseases 
depends on systematically incorporating effective public health programs at 
the state and local levels. 

In September 1990, the Secretary of Health and Human Services released 
the Year 2000 Objectives for the nation. Tobacco use was prominently 
featured in these objectives (80), which provide realistic goals for states and 
the nation in the public health practice of tobacco control. The objectives 
include six specific tobacco-related risk reduction goals and seven services 
and protection goals (Appendix). Periodic evaluation reports on progress 
toward achieving the objectives will be developed by different agencies to 
monitor the nation’s progress. 

As the national effort against tobacco matures, the actions of state and local 
health departments become more important. The states have well-defined 
public health powers and functions in relation to personal health services, 
environmental health, health resources, laboratory services, general adminis- 
tration and services, and support of local health departments (46). The Future 
of Public Health, a recent report by the Institute of Medicine, emphasized that 
states are and must be the central force in public health (44). Tobacco use is a 
public health problem that needs to be strongly addressed by these agencies. 
The essential elements of health department tobacco prevention and control 
programs have not yet been fully implemented or evaluated. This paper 
describes some of these elements and provides examples from the national 
effort that can be applied in the states. Future tobacco-control initiatives 
geared toward states are outlined. 


ESSENTIAL ELEMENTS OF TOBACCO PREVENTION 
AND CONTROL 


Effective strategies for tobacco control derive from those that have proved 
effective in reducing the population burden of communicable diseases: 
surveillance; increases in host resistance through immunization and improvement 
in general health; breaking the chain of transmission through case detection, 
containment, clinical treatment, control of vectors of transmission, environ- 
mental control, and support of personal measures to avoid exposure to the 
infectious agent; inactivation of the infectious agent through physical methods 
and treatment; and planning, implementation, and evaluation of control pro- 
grams (20). Many of these strategies can be applied to the control of chronic 
diseases caused by tobacco use. 

In 1989, the Association of State and Territorial Health Officials (ASTHO) 
conducted a survey of state health departments on programs, policies, and 
public health systems to prevent and control tobacco. The survey provided 
detailed data on components of state tobacco control programs, including 
budgets, planning activities, community activities, legislation, educational 
activities, and health department policies. The results showed that states 
varied widely in the strength and breadth of tobacco-control programs (21). 


Surveillance 


Communicable disease surveillance has been defined by the World Health 
Organization (WHO) as “ . . . the exercise of continuous scrutiny of, and 
watchfulness over, the distribution and spread of infections and factors related 
thereto, of sufficient accuracy and completeness to be pertinent to effective 
control” (20). Tobacco-related surveillance systems must also be simple, 
informative, uniform, and sensitive to changes in behavior, especially among 
target groups for whom interventions are planned. Such surveillance is critical 
in evaluating the long-term effects of tobacco control measures. 


ADULT SURVEILLANCE On the national level, several different surveys 
have provided extensive information on trends in tobacco-related knowledge, 
attitudes, beliefs, and behavior (Table 1). 

The Office on Smoking and Health’s Adult Use of Tobacco Surveys 
(AUTS) have provided detailed information on tobacco-related behavior, 
beliefs, and attitudes. In assessing public knowledge about the harmful effects 
of tobacco, these surveys found that beliefs about health consequences of 
smoking increased significantly between 1964 and 1986 (Table 2) (54). 

Data from the National Health Interview Survey (NHIS) have helped 
identify high-risk groups for targeting in the Year 2000 Objectives. For 
example, educational attainment has been found to be the single most impor- 
tant predictor for changes in the prevalence of current smoking (60). Those 
persons with the least educational attainment have not shown the same rapid 
decline in prevalence as those with the highest educational attainment. Cur- 
rent smoking prevalence declined among both men and women between 1973 
and 1987, but the rate of decline was greater among men (Figure 1) (32). 

Table 2 Percentage of US adults who believed smoking causes disease, 1964 and 1986^a 

Year and                                         Chronic 
smoking status    Lung cancer    Heart disease   lung disease 

Smokers 
  1964            53             S               42 
  1986            85             85 
Nonsmokers 
  1964            74             55 
  1986            95             6               9] 

^a From Ref. 54. 


Epidemiologists predict that smoking will be more common among women 
than among men in the US by the late 1990s (61). 

Beginning in 1985, a sufficient number of blacks were sampled by the 
NHIS to analyze smoking trends and race differences in behavior. The 
diffusion rate for the decline in smoking prevalence among blacks is similar to 
that among whites, even though the prevalence of smoking among blacks is 
higher than among whites in every survey year (32). When the 1985 NHIS 
data were adjusted for sociodemographic factors, blacks were found to have 
lower rates of quitting, but no differences were observed between blacks and 
whites in ever-smoking rates (56). 

The 1982-1984 Hispanic Health and Nutrition Examination Survey (HHANES) found a 
higher prevalence of current smoking among men (40%) and women (26%) 
than in the general population. The HHANES also revealed a remarkably high 
prevalence of smoking among Cuban-American men aged 20-34 years 
(50.1%) (42). 

State-specific data from the Centers for Disease Control (CDC) Behavioral 
Risk Factor Surveillance System (BRFSS) and the two Current Population 
Surveys (CPS) for 1985 and 1989 indicate that the prevalence of smoking is 
highest in the South and East (50; Office on Smoking and Health 1991, 
unpublished tabulations). State-specific progress toward the Year 2000 
Objectives, with respect to current smoking prevalence (Appendix), can be 
measured by using these data (Table 3). By using the 1985 CPS data, with an 
estimated -0.5 percentage point change per year in adult smoking prevalence, 
Remington et al (64) predicted that only nine states would reach the 
1990 Objective of 25%. By using the 1989 CPS and the national rate of 
change for projection of the state-specific adult prevalence of smoking to the 
Year 2000 [-0.58 percentage point change per year (84)], we found that 
only four states would meet the objective of 15% prevalence (Table 3). 
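
The projection described above is a straight-line extrapolation: the national rate of change is applied to each state's 1989 estimate for the 11 years to 2000. The sketch below illustrates that arithmetic under the assumption that every state follows the national rate; the function and variable names are hypothetical, and the four state values are simply copied from Table 3 for illustration.

```python
# Linear projection of adult smoking prevalence to the year 2000.
# Assumption: every state changes at the national rate quoted in the text.

ANNUAL_CHANGE = -0.58      # percentage points per year (national rate)
BASE_YEAR = 1989           # year of the CPS estimates
TARGET_YEAR = 2000
OBJECTIVE = 15.0           # Year 2000 objective for adult prevalence (%)


def project_prevalence(prevalence, annual_change=ANNUAL_CHANGE,
                       base_year=BASE_YEAR, target_year=TARGET_YEAR):
    """Extrapolate a prevalence (%) linearly from base_year to target_year."""
    return prevalence + annual_change * (target_year - base_year)


# A few 1989 values copied from Table 3; the selection is illustrative only.
prevalence_1989 = {
    "Alabama": 29.3,
    "California": 20.0,
    "District of Columbia": 21.3,
    "Kentucky": 31.2,
}

projected = {state: round(project_prevalence(p), 1)
             for state, p in prevalence_1989.items()}
meets_objective = sorted(s for s, p in projected.items() if p <= OBJECTIVE)

if __name__ == "__main__":
    for state, value in projected.items():
        print(f"{state}: {value:.1f}%")
    print("At or below 15%:", ", ".join(meets_objective))
```

Because the same 6.38-point reduction is applied everywhere, a state's projected standing depends entirely on its 1989 starting point; a state-specific rate of change, where one can be estimated, would give a different picture.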

292 NOVOTNY ET AL 

Figure 1 Prevalence of smoking among adults aged 20 years old or older, United States, 
1965-1987. Vertical axis: percentage of current smokers; horizontal axis: year. 
Source: NHIS 1965-87. 

In addition to adult smoking prevalence, the BRFSS can now describe a 
dynamic model of smoking cessation at the state level. Quitting smoking is a 
continuum, with attempts lasting days, months, or years before relapses and 
repeated efforts to quit. In 1990, the BRFSS questionnaire was modified to 
include questions on how many quit attempts were made in the last year, when 
they occurred, and how long they lasted. A quit attempt was defined as one day of abstinence; 
a major quit attempt was defined as at least seven days of abstinence; short- 
term quitting was defined as abstinence for less than three months; and 
long-term quitting was defined as abstinence for 3-12 months. Data 
based on these questions will enable states to evaluate recent changes in 
intentions and serious quitting attempts in response to programs at the state 
level. National data on this “quitting continuum” from the AUTS show 
that 34% of persons who smoked in the last year quit for at least one day 
and that 22.5% of the attempts resulted in abstinence for at least three 
months (41). 

Data from several different surveys suggest that the prevalence of smoking 
among pregnant women remains an area of concern. For the 42 states 
participating in the 1988 BRFSS, the collective prevalence of smoking among 
500 pregnant respondents was 18% (9). In the Year 2000 Objectives, the 
baseline estimate of current smoking prevalence for pregnant women was 
25% in 1987 (80). Small sample sizes for the BRFSS do not permit 
state-specific assessment of smoking prevalence among pregnant women. 

Table 3 State-specific prevalence (%) of cigarette smoking among adults aged 
20 years and older, 1989 and projected for the year 2000* 

State                   1989    Projected for Year 2000 

Alabama                 29.3    22.9 
Alaska                  28.6    22.2 
Arizona                 25.5    19.1 
Arkansas                28.6    22.2 
California              20.0    13.6 
Colorado                26.2    19.8 
Connecticut             28.8    22.4 
Delaware                29.9    23.5 
District of Columbia    21.3    14.9 
Florida                 25.0    18.6 
Georgia                 27.0    20.6 
Hawaii                  22.1    15.7 
Idaho                   23.0    16.6 
Illinois                25.9    19.5 
Indiana                 25.1    18.7 
Iowa                    25.0    18.6 
Kansas                  23.4    17.0 
Kentucky                31.2    24.8 
Louisiana               27.5    21.1 
Maine                   20.6 
Maryland 
Massachusetts 
Michigan 
Minnesota 
Mississippi 
Missouri 
Montana 
Nebraska 
Nevada 
New Hampshire 
New Jersey 
New Mexico 
New York 
North Carolina 
North Dakota 
Ohio 
Oklahoma 
Oregon 
Pennsylvania 
Rhode Island 
South Carolina 
South Dakota 
Tennessee 
Texas 
Utah 
Vermont                 25.6    19.2 
Virginia                25.6    19.2 
Washington              23.0    16.6 
West Virginia           27.8 
Wisconsin               24.0    17.6 
Wyoming                 27.1    20.7 

United States           25.5    19.1 

*From Office on Smoking and Health. 1989 Current Population Survey, 
US Bureau of the Census, unpublished tabulations. 


Another important surveillance system for states may be the use of data 
from birth certificates. These will include information on maternal smoking 
status in all states (33). The data will assist states in tracking possible 
smoking-associated infant mortality caused by Sudden Infant Death Syn- 
drome, low birthweight, and respiratory conditions. One study using data on 
more than 300,000 births in Missouri found that approximately 10% of infant 
mortality was attributable to smoking (49). This system does not account for 
exposure of the newborn to smoking by household members other than the 
mother. 

The 1989 ASTHO survey reported that only ten states collected data on 
specific target populations, primarily women of reproductive age (21). Thus, 
standardized surveillance of high-risk groups should be strengthened to moni- 
tor progress toward the Year 2000 Objectives. 


ADOLESCENTS In the US, about 90% of smokers begin to use tobacco 
before age 21 (84). Useful trend data have been provided by the National 
Institute on Drug Abuse (NIDA) High School Seniors yearly survey (45). The 
prevalence of daily cigarette smoking among high school seniors decreased 
from 29% in 1975 to 21% in 1980. After 1980, the prevalence leveled off at 
18% to 21% (Figure 2). Since 1976, prevalence of daily cigarette smoking 
among females has consistently exceeded that of males. 

PUBLIC HEALTH AND TOBACCO CONTROL 295 

Figure 2 Percentage of high school seniors reporting daily cigarette smoking, United States, 
1975-1990. Vertical axis: percentage of daily smokers; horizontal axis: year; series: 
females and males. Source: NIDA, Monitoring the Future Project. 

The CDC’s 1989 Teenage Attitudes and Practices Survey (TAPS) will 
permit detailed national analyses of all forms of tobacco use and of the 
predictors of tobacco use among young persons. Preliminary data from this 
survey suggest that the national prevalence of smoking (any cigarette in 
the last 30 days) among persons aged 12-17 years is 15% (11). The CDC 
has also recently released the Youth Risk Behavior Survey (YRBS), a standard 
questionnaire to be used in school surveys. This survey contains behavioral 
questions standardized to the national TAPS (39) and can provide 
state-specific information on tobacco use by school children. In addition, 
CDC used the YRBS to survey a national sample of 11,631 students in 50 
states and the District of Columbia in 1990. These data may be compared with 
state data when available. In 1990, more than one third (36.0%) of school- 
aged youths (grades 9-12) reported that they had smoked at some time in the 
30 days before the survey, and 13% reported that they had used cigarettes 
“frequently” (12). Assessing the smoking behavior among school dropouts, a 
high-risk group, is problematic for the YRBS and all other school-based 
surveys. 

One other source of data for smoking behavior among young persons is the 
CPS. This survey collects state-specific information on persons aged 16-19 
years, whether or not they are in school. No reported analyses have used these 
data, and it is not known how many of the small sample of 16—19-year-olds in 
each state are dropouts. 

The ASTHO survey reported that 34 states collected information on the 
prevalence of tobacco use among adolescents, but none of the states’ surveys 
covered all the standard questions included in the YRBS (21). These ques- 
tions cover experimentation, current use of tobacco, age of initiation, and 
smokeless tobacco use. Thus, by using the YRBS, standardized surveillance 
of youth will be strengthened so that states can evaluate programs and 
progress toward specific Year 2000 Objectives. Core questions will be used in 
both national and state-based surveys to assure comparability (80). 


PUBLIC OPINION POLLS Information from such sources as the Gallup 
Organization Surveys provides some idea of the coverage of public informa- 
tion campaigns and important ongoing information about public beliefs and 
attitudes toward tobacco. These data may be important in formulating public 
policy. For example, in 1988, the Gallup Survey reported that 60% (75% of 
nonsmokers and 26% of smokers) favored a total ban on smoking in public 
places, and 55% (64% of nonsmokers and 34% of smokers) favored restric- 
tions or a ban on cigarette advertising. In addition, NHIS and AUTS data 
show that attitudes among adults toward environmental tobacco smoke sup- 
port additional restrictions on smoking in public places (16, 26). This type of 
information supports the enactment of health-policy interventions by inform- 
ing legislators about the true public sentiment toward controversial bills (84). 
Such surveys also help evaluate the effect of public information campaigns 
through questions that ask about the recognition of material presented in a 
targeted community (62). The analysis of data from public information 
surveys is a useful method of identifying “high-risk” groups in which anti- 
tobacco campaign messages have not been fully received. 


PROCESS MEASURES The ASTHO survey compares state health departments’ 
current activities in tobacco prevention and control and establishes a 
baseline for measuring future progress toward Year 2000 Objectives 3.10, 
3.11, 3.12, 3.13, 3.14 (Appendix) (21). 

Data on worksite policies are not available on a state-specific basis, but 
national data can be obtained from various national sources, such as business 
groups and unions (8, 30). States may collect worksite data from local 
resources, such as business groups, chambers of commerce, or specific 
worksite surveys. 

If schools are targeted to be tobacco-free, state departments of education 
should establish surveillance systems so that progress towards this goal can be 
measured. The National School Boards Association has conducted two 
national surveys of school boards to evaluate smoking policies. In 1989, 78% 
of all responding school boards had antismoking programs, and 95% had a 
written policy on smoking in schools (17). Antitobacco educational programs 
in schools may be considered “immunization” for students against cues that 
encourage tobacco use. Assurance of the application of this education is 
necessary to provide youth with resistance skills needed to meet Year 2000 
Objectives. The National Cancer Institute (NCI) has developed guidelines for 
antitobacco education in schools (34). 


EXCISE TAX DATA AND TOBACCO CONSUMPTION Measurement of state-specific 
tobacco consumption is possible by using data on excise taxes collected 
on tobacco products (73). These data are usually expressed in terms of per capita 
cigarette consumption for adults aged 18 years and older. A recent report 
demonstrated that for 1974-1985, self-reported consumption (including data from 
surveys of youth, recent quitters, and current smokers) was about 30% lower than 
consumption based on cigarette sales. The report concluded that a consistent bias 
has remained between self-reported consumption data and actual tobacco sales 
data over time. Thus, the validity and reliability of survey data on smoking 
behavior have not changed in recent years (40). A correlation between significant 
social and health information events and national per capita consumption has been 
observed over several decades (Figure 3). Similar tracking of significant state- 
level events, such as increases in excise tax, may be used by states to monitor 
changes in consumption. In California, the total number of cigarettes purchased 
per month is used to track tobacco sales in response to implementation of an 
additional excise tax on cigarettes that took effect in January 1989 (43, 74) 
(Figure 4). 

Because of year-to-year fluctuation in inventory at the wholesale level (at 
which excise taxes are collected), these data are best reported as three-year 
moving averages, or as 12-month averages for monthly changes. 
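
The smoothing recommended above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: it assumes a trailing window (the text does not say whether the average is trailing or centered), and the annual figures are invented solely to show the arithmetic.

```python
# Moving average for consumption series derived from excise tax data.
# Assumption: a trailing window is used; a centered window is equally plausible.

def moving_average(values, window):
    """Return the trailing moving average, one value per complete window."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical annual per capita consumption (packs), for illustration only.
annual_packs = [120.0, 126.0, 114.0, 117.0, 105.0]
three_year = moving_average(annual_packs, 3)

if __name__ == "__main__":
    print(three_year)
```

The same function applied with `window=12` to a monthly series gives the 12-month averages mentioned for monthly changes.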

Figure 3 Adult per capita cigarette consumption and major smoking and health events, United 
States, 1900-1985. 

Figure 4 Total cigarettes sold in California from 1980 through 1990. 


Problem Assessment 


The second element of tobacco control activities in the US is problem 
assessment. This process consists of a detailed analysis of current smoking 
behavior, tobacco consumption, current program capabilities, and disease 
impact of smoking (smoking-attributable morbidity, mortality, and economic 
costs). We described the first three aspects of this analysis in the first section 
of this report. 

The analysis of the disease impact of smoking in the US includes both 
standard epidemiologic concepts (attributable risk calculations based on prev- 
alence and relative risk), as well as economic cost estimates based on pre- 
valence of risk factors and relative rates of medical care utilization and 
disability. The critical calculation in prevalence-based disease impact estima- 
tion is the attributable fraction formula: 


$$
\text{Smoking-attributable fraction} = \frac{p(RR - 1)}{p(RR - 1) + 1}
$$


where p is the prevalence of smoking and RR is the relative risk of death from 
a particular disease for smokers compared with never-smokers (87). The 
relative risks for 14 smoking-associated conditions were reported in the 
Surgeon General’s 1989 report (84). These risk estimates and 1988 smoking 
prevalence and national mortality data indicate that an estimated 434,000 
deaths were attributable to smoking in 1988 in the US (10). Economic 
estimates based on direct medical care costs and indirect losses attributable to 
disability and premature mortality have also been made. One national es- 
timate is approximately $65 billion in smoking-attributable economic costs 
for 1985 (86). 
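
The attributable fraction formula given above can be applied directly to a disease-specific death count. The following Python sketch uses the single-exposure form stated in the text; the function names are hypothetical, the input values are invented for illustration, and it is not a reproduction of the SAMMEC software, which may stratify the calculation further.

```python
# Smoking-attributable fraction (SAF) and attributable deaths.
# SAF = p(RR - 1) / (p(RR - 1) + 1), as defined in the text.

def attributable_fraction(prevalence, relative_risk):
    """Return the SAF for a smoking prevalence (0 to 1) and a relative risk."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    if relative_risk < 0.0:
        raise ValueError("relative risk cannot be negative")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (excess + 1.0)


def attributable_deaths(total_deaths, prevalence, relative_risk):
    """Scale a disease-specific death count by the SAF."""
    return total_deaths * attributable_fraction(prevalence, relative_risk)


if __name__ == "__main__":
    # Hypothetical inputs: 30% smoking prevalence, relative risk of 10,
    # and 1,000 deaths from the disease in question.
    saf = attributable_fraction(0.30, 10.0)
    print(f"SAF = {saf:.3f}")
    print(f"Attributable deaths = {attributable_deaths(1000, 0.30, 10.0):.0f}")
```

A relative risk of 1 yields a fraction of zero, and the fraction approaches 1 as either the prevalence or the relative risk grows, which is the behavior the formula is meant to capture.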

Each state may individualize its smoking-attributable disease impact es- 
timate by using software [Smoking Attributable Mortality, Morbidity, and 
Economic Costs (SAMMEC, and its successor SAMMEC II)] specifically 
designed for this purpose (69). State-specific mortality, 1985 CPS prevalence 
data, and economic data from the Health Care Financing Administration were 
used to compile a 50-state estimate of smoking-attributable morbidity, mortal- 
ity, and economic costs for the National Status Report on Tobacco and Health 
(81). In this report, the number of smoking-attributable deaths for 1985 
ranged from 271 in Alaska to 28,533 in California. The range of direct and 
indirect prevalence-based economic costs of smoking was $82.3 million in 
Alaska to $5.8 billion in California in 1985. By performing disease impact 
estimation at several-year intervals, behavioral surveillance information from 
CPS or BRFSS may be utilized to demonstrate state-specific patterns of 
tobacco-related mortality over time. These data are useful in reinforcing the 
importance of the tobacco-related disease burden (compared with other risks) 
to policymakers (88). The Institute of Medicine’s report stressed the federal 
capacity-building role in disseminating data and information useful to states. 
SAMMEC II software is an example of such public health capacity building. 

Another source of information about the mortality impact of smoking for 
states may be death certificates. Tobacco-use disorder is a specific category in 
the International Classification of Diseases, ninth revision; some researchers 
recommend that this diagnostic category (305.1) be used more frequently by 
physicians in the assignment of cause of death (63). In 1989, five states 
recorded smoking history on death certificates (21). Few analyses of these 
data have been reported, but researchers in Oregon have used follow-back 
surveys of physicians to compile detailed data on the listing of smoking as a 
cause of death on death certificates. The researchers found that smoking was recorded 
as a contributing or underlying cause of death for 77.1% of lung cancer 
deaths (37). This is remarkably close to the mathematically derived national 
smoking-attributable fraction (87%) of lung cancer deaths reported by the 
Surgeon General (84). 


Legislation and Policies 


Policies and legislative actions are essential to state and local public health 
efforts in meeting the Year 2000 Objectives. They refocus tobacco use as a 
community public health concern, rather than as simply an individual behavior 
problem. In 1983, WHO’s Expert Committee on Smoking Control 
stated: 


It may be tempting to try introducing smoking control programmes without a legislative 
component, in the hope that relatively inoffensive activity of this nature will placate those 
concerned with public health, while generating no real opposition from cigarette man- 
ufacturers. This approach, however, is not likely to succeed. A genuine broadly defined 
education programme, aimed at reducing smoking must be complemented by legislation 
and restrictive measures . . . (90) 


Policies and laws that regulate smoking in public places, the access to tobacco 
by children and youths, and tobacco product advertising (at least on public 
property) are within the jurisdiction of state and local governments. 


CLEAN-INDOOR-AIR POLICIES The Year 2000 Objectives call for com- 
prehensive laws on clean indoor air in all states, for 75% of all worksites to 
have a formal smoking policy that prohibits or restricts smoking, and for all 
schools to have tobacco-free environments (80). In the US, state clean- 
indoor-air laws have become more widespread and stronger over the last two 
decades because of concerns about the health consequences of environmental 
tobacco smoke and a growing “nonsmokers rights” movement. Nominal laws 
regulate smoking in one to three public places, excluding restaurants and 
private worksites; basic laws regulate smoking in four or more different public 
places, excluding restaurants and private worksites; and moderate laws regu- 
late smoking in restaurants, but not private worksites. Laws that cover private 
workplaces (as opposed to those only covering public-sector worksites) are 
considered “extensive” laws by the Surgeon General (84), because it is 
difficult to enact policies in the private sector. Preemptive laws, which 
appeared first in 1990, are state laws that prohibit local jurisdictions from 
enacting restrictions more stringent than the state law. 
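
The four-level classification of state laws described above can be written as a simple decision rule. The Python sketch below is one reading of those definitions; the function name, argument names, and the "none" category for states without any restriction are assumptions added for illustration.

```python
# Classification of state clean-indoor-air laws, following the categories
# described in the text (nominal, basic, moderate, extensive).

def classify_clean_indoor_air_law(public_places, covers_restaurants,
                                  covers_private_worksites):
    """Classify a law by its coverage.

    public_places: number of public places regulated, not counting
        restaurants or private worksites.
    covers_restaurants: True if smoking in restaurants is regulated.
    covers_private_worksites: True if private-sector worksites are covered.
    """
    if covers_private_worksites:
        return "extensive"
    if covers_restaurants:
        return "moderate"
    if public_places >= 4:
        return "basic"
    if public_places >= 1:
        return "nominal"
    return "none"  # assumed label for a state with no restrictions


if __name__ == "__main__":
    print(classify_clean_indoor_air_law(2, False, False))
    print(classify_clean_indoor_air_law(6, True, False))
    print(classify_clean_indoor_air_law(6, True, True))
```

The checks run from the most to the least comprehensive category, so a law covering private worksites is classed as extensive regardless of how many other places it regulates.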

Before the mid-1970s, minimal state legislation restricted smoking in 
public places. As of 1990, 45 states and the District of Columbia have laws 
that restrict smoking in public places (Figure 5); in 16 of these states and the 
District of Columbia, the restrictions also apply to private-sector workplaces 
(21). States in tobacco growing areas are the least likely to have extensive 
clean-indoor-air legislation. However, a trend has begun that may help con- 
vince states to strengthen and enforce their laws as part of a larger national 
effort. 

Figure 5 Number of states with laws regulating smoking in public places, by year, United 
States, 1960-1990. 

Another trend, evident since 1990, is the inclusion of preemption clauses in 
state laws. These clauses effectively prohibit local governments from enacting 
stronger clean-indoor-air restrictions than found in the state law. Preemption 
clauses usually signal a victory for the protobacco interests, because they 
effectively weaken local efforts that are usually more restrictive on smoking 
in public places than the state laws. As of October 1990, clean-indoor-air 
legislation has included preemption clauses in seven states (72) (Figure 5). 

Local ordinances restricting smoking in a wide range of public areas (e.g. 
restaurants, elevators, hotels, libraries) are found in more than 450 communit- 
ies (3). As a result of these local ordinances, 23% of the US population (57 
million persons) is covered by specific local regulations (21). When the state 
laws are included in this calculation, over 90% of the US population is 
covered by some kind of clean-indoor-air regulation. 

An increasing proportion (from 36% in 1986 to 57% in 1987) of businesses 
have adopted their own restrictions (8), but most businesses report that the 
presence of laws or regulations leads to the adoption or extension of smoking 
policies at worksites (29). Many laws regulating smoking at the worksite 
exclude smaller establishments from coverage; however, small companies 
employ a large proportion of the national workforce (80). One spin-off of 
widespread worksite restrictions on smoking is that worksites with restrictive 
policies are more likely to offer cessation programs (64%) than are companies 
without such policies (38%) (8). 

Governors or municipal administrators may also impose restrictions on 
those offices controlled by the executive branch. The governors of Colorado 
and North Dakota have completely banned smoking in executive-controlled 
offices. These actions avoid the competing political agendas often encountered 
in legislative actions, but require strong support from state health 
departments in developing and implementing the directives (21). 

Restrictive policies on tobacco in state health department buildings may be 
important in demonstrating official commitment to tobacco control activities. 
The 1989 ASTHO survey reported that all state health departments, with the 
exception of North Carolina and Virginia, had a written policy on smoking in 
state health department buildings (Virginia has since passed a clean-indoor-air 
law that covers public workplaces). However, only 23 of these departments 
completely ban smoking in all health department facilities. Thirty-one states 
permit the sale of tobacco products in health department buildings (21). 

In 1991, the Environmental Protection Agency’s Scientific Advisory Board 
drafted a report that designated environmental tobacco smoke as a Class A 
(human) carcinogen, a substance to which there is no completely safe expo- 
sure level (85). The increasing trend in strength and coverage of restrictions 
on smoking in public will probably be supported by this finding (29) when 
and if the report is officially released. 


RESTRICTIONS ON ACCESS TO TOBACCO BY MINORS Few states have ever 
enforced restrictions on the purchase of tobacco products by minors (14). 
Although 45 states and the District of Columbia prohibit the sale of tobacco 
products to underage persons, only 21 states and the District of Columbia 
require a retail vendor’s license to sell tobacco, and only eight states’ laws have 
clear enforcement provisions. A report by the Inspector General of the US 
Public Health Service concluded that only 32 violations were cited in the five 
states that collected such data in 1990 (82). 

Additional strengthening and enforcement of these restrictions have been 
encouraged by the US Department of Health and Human Services. In May 
1990, the Secretary of Health and Human Services released model legislation 
that would improve the enforcement and coverage of laws restricting minors’ 
access to tobacco (71). Seven essential elements are covered in this model 
legislation: 


1. Create a licensing system similar to that used to control the sale of 
alcoholic beverages. 

2. Set the minimum age of legal purchase at 19 years. 

3. Set forth a graduated schedule of penalties for illegal sales to minors (fines 
and license suspensions). 

4. Provide separate penalties for failure to post warning signs about the 
illegality of sales to minors. 

5. Place primary responsibility for enforcement with a designated state 
agency. 

6. Rely primarily on civil penalties, rather than on the court system, to punish 
offenders. 

7. Ban the use of vending machines to dispense tobacco products. 


RESTRICTIONS ON ADVERTISING Cigarettes are the most heavily advertised 
consumer product (25). In the infectious disease control model cited above, 
advertising may be thought of as a critical vector in the “chain of transmis- 
sion” of the tobacco epidemic. Thus, vector control may require restrictions 
on advertising to prevent initiation of tobacco use by susceptible persons. The 
preemption clause of the 1969 Public Health Cigarette Smoking Act prevents 
state governments from regulating most cigarette advertising (84). However, 
states and localities can and do regulate some local advertising. In Utah, 
tobacco advertising is banned by law from any billboard, public transport 
facility, or any other object of display. Six other states (Arizona, California, 
Colorado, Massachusetts, Hawaii, and Nebraska) have communities that 
restrict tobacco advertising through local legislation on public properties, 
such as sports stadiums and transit facilities. Cities and states can also restrict 
or ban the free distribution of tobacco product samples, and at least 14 cities 
in the US have done so (84). 


INCREASING TAXES Another legislative effort that may effectively inhibit 
young persons from smoking is increasing the excise tax on cigarettes. 
Currently, all states impose a tax on each package of cigarettes. These taxes, 
which range from three cents (North Carolina) to 41 cents (Texas) per pack, 
are generally lowest in the tobacco-producing states (21, 73). According to 
econometric studies by Lewit & Coate (48), the negative price elasticity of 
demand observed for such consumer goods as tobacco is most effective in 
reducing consumption for those who have the least amount of disposable 
income. These authors found that the overall price elasticity of demand was 
-0.42 for cigarettes, but that the value for youths aged 12-17 years was more 
than three times as high (-1.40). Thus, an increase in the price of cigarettes 
through taxation will particularly inhibit the initiation of smoking by teen- 
agers, although it will also suppress the per capita consumption of cigarettes 
for the population in general. 
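
As a rough illustration of how these elasticities translate into consumption changes, the following sketch applies the usual linear approximation (percentage change in demand is approximately the elasticity multiplied by the percentage change in price). The two elasticity values are those quoted above from Lewit & Coate (48); the 10% price increase and the function name are hypothetical and chosen only for illustration.

```python
def pct_change_in_demand(elasticity: float, pct_price_change: float) -> float:
    """Linear (point-elasticity) approximation: %dQ = elasticity * %dP."""
    return elasticity * pct_price_change


# Elasticities quoted in the text (Lewit & Coate).
OVERALL_ELASTICITY = -0.42
YOUTH_ELASTICITY = -1.40  # youths aged 12-17 years

# Hypothetical price increase, in percent (an assumption for illustration).
price_rise_pct = 10.0

overall_change = pct_change_in_demand(OVERALL_ELASTICITY, price_rise_pct)
youth_change = pct_change_in_demand(YOUTH_ELASTICITY, price_rise_pct)

print(f"Overall demand change: {overall_change:.1f}%")
print(f"Youth demand change:   {youth_change:.1f}%")
print(f"Youth/overall ratio:   {YOUTH_ELASTICITY / OVERALL_ELASTICITY:.2f}")
```

Under this approximation, a 10% price rise would correspond to a fall of roughly 4.2% in overall consumption and roughly 14% among youths. The approximation is reasonable only for modest price changes.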

Two correlation studies using state-specific consumption data have been 
reported. Changes in excise taxes over time appear to be more closely 
correlated longitudinally with changes in consumption than the enactment of 
clean-indoor-air legislation (59). However, those states with the strongest 
clean-indoor-air acts appear to have the lowest per capita consumption (28). 
In 1981, Warner (89) suggested that the dramatic correlation between diffu-
sion of clean-indoor-air legislation and cigarette consumption was best in-
terpreted as parallel changes in social attitudes toward smoking. The same
principle probably holds true in 1991. Besides the economic effects of 
increasing cigarette prices, portions of the revenue from excise taxes have 
been earmarked by some states to fund tobacco-control programs (6, 84). 

In 1988, California voters passed Proposition 99, which increased the state 
excise tax on cigarettes from 10 to 35 cents per package of 20 cigarettes; one 
fifth of the revenues from this tax initiative were directed to tobacco-related 
public education (more than $100 million per year). The imposition of a 25 
cent/pack tax increase, combined with a large intervention program, was 
associated with a sharp (15%) decline in tobacco sales in California (Figure 4) 
(74). Hu et al (43) estimated that this decline was 1.22 packs less per person 
(aged 15 years or older), i.e. the effect of the tax is to suppress consumption 
by 1.22 packs over that which would be predicted without the tax. By 
December 1990, however, the effect of the tax had dwindled to 0.64 packs 
less than predicted before the tax. Without constant adjustments for inflation 
or an ad valorem tax, the effect of tax increases diminishes over time. In
California, the effect of the 1989 tax will be negligible by 1993 unless the
tax is increased.
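
The erosion of a fixed per-pack tax by inflation can be sketched as follows. The 25-cent increase is taken from the text; the 4% annual inflation rate and the function name are assumptions made only for illustration and are not estimates from Hu et al (43). The observed dwindling of the tax effect may also reflect factors other than inflation.

```python
def real_value(nominal_cents: float, annual_inflation: float, years: int) -> float:
    """Value of a fixed nominal tax, expressed in year-0 cents, after `years` of inflation."""
    return nominal_cents / (1.0 + annual_inflation) ** years


TAX_INCREASE_CENTS = 25.0   # Proposition 99 increase per pack (from the text)
ASSUMED_INFLATION = 0.04    # hypothetical annual inflation rate (assumption)

for year in range(0, 5):
    value = real_value(TAX_INCREASE_CENTS, ASSUMED_INFLATION, year)
    print(f"Year {year}: real value of the increase = {value:.2f} cents")
```

With the assumed 4% rate, the real value of the increase would fall from 25 cents to a little over 21 cents after four years. An ad valorem tax, set as a percentage of price, would not erode in this way because it rises with the price of the product.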

The health benefits of increases in cigarette taxes are substantial. One 
report estimated that over 800,000 premature deaths in a 1984 cohort of 
Americans 12 years and older would have been averted if the federal excise 
tax on cigarettes had been maintained at its real value in 1951 (38). The 
positive health effects of increasing cigarette taxes may also be appreciated by 
the states, but because of the long lag periods characteristic of smoking- 
attributable chronic diseases, the benefits may not be observed for several 
years. 


Health Department and Community-Based Programs 


ASTHO TOBACCO PREVENTION AND CONTROL NETWORK In 1989, 
ASTHO developed a network of state health department personnel concerned 
with tobacco issues in each state. These persons generally represented the 
health education or health promotion divisions within the departments, but 
they also included epidemiologists and chronic disease specialists. This network
now acts as a conduit for information, the dissemination of new technology, 
and communications between states. National meetings, as well as special 
projects of the network, are supported through funding from the NCI, the 
CDC, and the National Heart, Lung, and Blood Institute (4). 

A unique effort supported through the ASTHO network to increase commu- 
nity involvement in tobacco control has been established by the governors of 
eight western states through the Rocky Mountain Tobacco-Free Challenge. 
This challenge has a goal to achieve significant (50%) reductions in tobacco 
use and tobacco-related disease by the year 2000 (18). Over the first three 
years of this effort, the participating states have increased funding, sur-
veillance activities, community interventions, and other program components
that may contribute to an accelerated decline in tobacco use in the region (55, 
57). Such regional efforts help strengthen states’ capacity through collegial 
relationships among states that have similar geography, target populations,
and political climate. 


NCI INTERVENTION TRIALS In 1988, Cullen (22) articulated the NCI strat- 
egies for reducing smoking in the US, so that the NCI goal of reducing the 
cancer mortality rate 50% by the Year 2000 could be met. A key component 
in these strategies was the ambitious community-based intervention trial for 
smoking cessation among smokers in 22 different sites (COMMIT) (52). This 
trial, the largest in the NCI repertoire, will involve almost 2 million persons in 
the application of smoking cessation strategies through community organiza-
tions and social institutions. Heavy smokers (≥ 25 cigarettes per day) are
prime targets of these interventions. The rationale for the intervention is based 
on data from community heart disease prevention trials in the US. These data 
emphasize that multiple interventions should be incorporated into natural 
educational channels and social structures that have the potential to reach 
large segments of the smoking population (52). For instance, worksite promo-
tion of cessation through presentations, cessation programs, development of
audiovisual materials, and consultations is a major focus of COMMIT
activities. The COMMIT approach is based on previous NCI community- 
based research. The overall intervention goals of the project are the following: 


1. Increase the priority of smoking as a public health issue.

2. Improve the community’s ability to modify smoking behavior.

3. Increase the influence of existing policy and economic factors that dis-
courage smoking.

4. Increase social norms and values supporting nonsmoking.


Results from the COMMIT project are not yet available. 

Besides COMMIT, NCI has funded individual interventions in eight specif- 
ic problem areas. More than 10 million individuals in 25 states and over 299 
cities are affected by these efforts. The interventions include the following: 

1. School-based programs. These comprehensive programs teach social 
pressure resistance to students, involve parents and peer leaders in the educa- 
tion, call for schoolwide support of nonsmoking norms, and emphasize 
longitudinal follow-up. Curricula, which have been developed to target high- 
risk youths, include an emphasis on the stages of change model (infrequent 
use leading to addiction) (34). In 1989, the National School Boards Associa- 
tion found that 95% of school districts in the US had a written policy on 
smoking in schools and that 17% of schools banned smoking on school
premises or at school functions (17). The ASTHO survey found that 26 states 
and the District of Columbia ban smoking for students, and only eight states 
completely ban smoking in schools for both students and staff. However, only 
23 states and the District of Columbia are able to report information on 
policies and education activities in school districts, which are basically auton- 
omous units within their jurisdictions. 

2. Minimal interventions. Self-help programs are the most cost-effective 
method of delivering cessation messages (35). These can be supported 
through telephone hot lines, social support groups, worksites, newsletters, 
community groups, manuals, and health care providers (23). 

3. Health care providers. Approximately 700 providers and more than 
40,000 smokers were originally expected to be targeted by this intervention 
(22). Through wide dissemination of an NCI “Train the Trainer” program, 
cosponsored by the American Medical Association and the American Dental 
Association, over 100,000 physicians and dentists will be trained to counsel 
smokers (36). The Year 2000 Objectives for the nation call for 75% of 
primary care and oral health care providers to advise cessation routinely and 
provide assistance and follow-up for all tobacco-using patients (80). Over 
70% of smokers visit a health care provider once a year, which may be 
considered a “teachable moment” in a time of vulnerability. According to the 
1986 AUTS, only 45% of smokers reported that a physician had ever advised 
them to stop smoking (24). 

4. Mass media. Mass media reaches the largest number of smokers (an 
estimated 5 million) in NCI intervention trials. A more detailed description of 
media and communications efforts is presented later. 

5-7. High-risk groups. Blacks, Hispanics, and women are targeted 
through NCI programs that use multiple channels. These high-risk groups 
may be at a lower point in the classic diffusion of innovations curve (in this 
case, nonsmoking is the innovation) than others (65). Their smoking patterns 
may be unique. Thus, targeted interventions are included in the NCI commu- 
nity intervention protocols to improve the diffusion of the nonsmoking norm 
among these groups. 

8. Smokeless tobacco. These NCI intervention efforts concentrate on 
identifying patterns of use and channels through which users may be in- 
fluenced. Channels include 4-H clubs, dental health maintenance organiza- 
tions, Little League, and Native American organizations. 

Overall, the NCI goal is to integrate effective cancer control technology 
into existing health care delivery systems, health promotion efforts, or cancer 
control programs (22). Several consensus documents have been published that 
deal with this integration. They include consensus reports on health mainte- 
nance organizations, pregnant women, self-help programs for smoking cessa- 
tion, and school-based programs (70). 







State health departments are the focus for the next stage (Phase V research) 
in the demonstration of NCI COMMIT and other program results. In 1991, 
the NCI mounted the world’s largest demonstration project for tobacco con- 
trol and health promotion. The American Stop Smoking Intervention Study 
(ASSIST) is sponsored by both the NCI and the American Cancer Society. 
Almost $120 million will be spent in 17 states to support this project between 
1991 and 1999, and it will include an extensive evaluation by using data from 
future Current Population Surveys (51). 


CESSATION PROGRAMS Smoking cessation programs (the clinical treatment 
component of the disease control model) are essential in the multiple channel 
approach to tobacco control. These are provided by many private voluntary 
health organizations, state and local health departments, for-profit companies, 
hospitals, and schools. Although the majority (90%) of successful quitters do 
not use formal programs (31), heavier (more than 25 cigarettes per day), more 
addicted smokers tend to use programs more frequently than lighter smokers, 
thus demonstrating a need for these programs within broad-based tobacco 
control activities (31, 79). 

The average success of cessation programs depends on their clientele, the 
methods used, the definition of “success,” and other factors, but most pro- 
grams that report follow-up evaluation studies show about 20-40% success 
rates at one year of follow-up (68). Rates for voluntary participation in
cessation programs are low (31). Moreover, medical insurance does not, as a 
rule, cover payment for such programs, and cessation programs are often not 
appropriately designed for, nor accessible to, the most hard-to-reach pop- 
ulations. Thus, the recent proliferation of smoking cessation services may not 
be associated with widespread behavioral changes. In 1989, 34 states and the 
District of Columbia offered smoking cessation programs to state health 
department employees, and 26 states offered such programs to members of 
the community (21). 


Economic incentives and deterrents may be effective in supporting cessa- 
tion. The state governments of Colorado, Kansas, and Washington offer 
differential health insurance rates to smokers and nonsmokers (21). The 
impact of these programs may be widespread, because the state government is 
the largest or one of the largest employers in many states. Once again, the 
state can act as a change agent in setting an example for other providers of 
health insurance. 


HIGH-RISK GROUPS The NCI has placed emphasis on special populations,
including blacks, Hispanics, and women, in its individual community trials
(70). These trials focus on late adopters of the nonsmoking norm and on those
for whom the health consequences of smoking are of particular concern, such 
as heavy smokers. In 1989, most states (n = 38) had programs that included 
education and information for some or all of these groups (21). Most states (n
= 37) addressed women of reproductive age, and fewer states had programs 
for the other groups: youths (20 states), blacks (14 states), Hispanics (11 
states), Native Americans (eight states), Asian/Pacific Islanders (three states), 
and elderly adults (three states). The CDC has also supported the Smoking 
Cessation in Pregnancy Project in three states (Missouri, Colorado, Mary- 
land), which will be disseminated to others through existing state resources. 
In Colorado, preliminary data suggested that an intensive intervention di-
rected toward pregnant women who attend public clinics can produce a 50%
relative improvement in quitting (13.9% of women in the experimental group
quit, compared with 9.3% of women in the control group) (91).
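
The distinction between the absolute and the relative improvement in the Colorado data can be made explicit with a short calculation. The two quit rates are those reported above; the variable names are illustrative only.

```python
# Quit rates reported in the text for the Colorado project, in percent.
quit_rate_experimental = 13.9
quit_rate_control = 9.3

# Absolute improvement: difference in percentage points.
absolute_improvement = quit_rate_experimental - quit_rate_control

# Relative improvement: difference as a percentage of the control-group rate.
relative_improvement = 100.0 * absolute_improvement / quit_rate_control

print(f"Absolute improvement: {absolute_improvement:.1f} percentage points")
print(f"Relative improvement: {relative_improvement:.1f}%")
```

The difference of 4.6 percentage points corresponds to a relative improvement of about 49.5% over the control group, which is the basis for the rounded figure of 50%.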


Public Information Campaigns 


The effect of past media campaigns has been seen in the decline in tobacco 
consumption associated with public service announcements in the early 1970s 
(Figure 3). Over the past 25 years, media campaigns have been developed for 
specific groups (minorities, pregnant women, and adolescents) and have 
contributed to the reduction in tobacco use by these groups. Unfortunately, 
the tobacco industry has also systematically targeted many of these same 
groups (12, 25). 

At first, communications campaigns were “passive,” in that information 
was provided to the public so that personal decisions could be made based on 
accurate health information. Now, media also serves to change attitudes, 
reinforce and maintain interest, provide cues to simple action, set a social 
agenda, and demonstrate simple skills (53). Media messages are now likely to 
provide a stimulus to positive behavior change because of a changed social 
milieu and a “conditioned” target audience that supports nonsmoking (27). 
The recent competitive and complex media environment and the increase in 
intense counterefforts of the industry have resulted in a need for increased 
communication capacity in tobacco control by state and local health de- 
partments. The NCI has published a consensus document on media and 
tobacco control that is particularly applicable to state and local health de- 
partments (53). 

Social marketing is a technique for creating a need for a particular product 
or service (7). Social marketing can also create a milieu in which tobacco use 
is no longer the norm, thus facilitating change among users and discouraging 
young persons from beginning to use tobacco (27). Public information cam- 
paigns are the cornerstone of social marketing efforts. 

Training constituency groups (in particular, health department personnel) 
in media relations is another key component of successful public health 
communications efforts (27). These groups can then disseminate scientific 
information to the public about tobacco-related disease and the need for
tobacco-control programs. State involvement in media now calls for a com-
bined, systematic approach that uses media relations to build a media con-
stituency for the issue; public information that can craft appropriate preven-
tion and cessation messages for target audiences through appropriate chan-
nels; and media advocacy to focus media attention on tobacco issues. The
1989 ASTHO survey indicated strong public information activities by states. 
Thirty-one states and the District of Columbia originated public information 
campaigns within the last three years. All but seven states made use of public 
service announcements produced by such federal agencies as the NCI and 
CDC (21). 

An increasing number of states, including California, Michigan, and Min- 
nesota, are appropriating public funds to buy media time for nonsmoking 
advertisements. In California, $28.6 million was dedicated to the media
campaign alone in 1989-1990. Because paid advertising allows for more 
control over the message, these messages can have more of an impact than 
those that use free time. However, the costs for paid advertising can be 
prohibitive (27). The NCI has recently published guidelines on the use of paid 
media in tobacco control (58). 

Computer networks permit the rapid dissemination of information and 
exchange of ideas. An electronic bulletin board specifically dedicated to 
tobacco, the Smoking Control Advocacy Resource Center Network (SCARC- 
Net), has been developed by the Advocacy Institute in Washington, DC (1). 
To date, 13 states participate in this service (Advocacy Institute 1991, per- 
sonal communication), which includes daily news briefings, information 
exchange, and issue updates. With complete participation in this service, 
coordinated responses by health departments to constantly changing tobacco- 
related issues would be possible. 


Technical Information Collection and Dissemination 


More than 60,000 articles have been written about tobacco and health issues 
(84). The database of information on tobacco use is expanding daily; the 
scientific community and the public need to be kept informed about progress 
in controlling tobacco use. Thus, the maintenance and dissemination of 
technical information about tobacco is a major component of the public health 
practice of tobacco control. The Office on Smoking and Health’s Technical 
Information Center maintains the database by using in-house library software
(STAR) and a standard database for medical literature (DIALOG, File 160). 
New publications are included in the quarterly Bibliography on Smoking and 
Health (78). This document covers about 2000 citations and abstracts each 
year in the field of tobacco and health. All of these resources are directly 
accessible to states. 

Brief, clear, technical information that has local relevance is an important 
tool for state and local health departments in efforts to reduce tobacco use
among their constituents. A wealth of factual resources, as well as cessation 
and prevention materials, exists for both the public and health professionals. 
An important function of state tobacco control programs is to provide access 
to these products and apply them at the local level. 


Coalition Building, Community Planning, and Evaluation 


COALITIONS Coalitions are an integral part of tobacco-control activities in 
the US. The more a coalition extends beyond the health community, the more 
ownership the entire community exerts over tobacco-control initiatives. 
Coalitions may represent public health officials, health care providers, 
advocacy groups, voluntary health organizations, business groups, religious 
groups, government officials, the insurance industry, the legal profession, the 
military, labor organizations, economists, educators, advertisers, and com- 
munications specialists. Activities of coalitions include advising the state 
health department, lobbying for antitobacco legislation, developing and im- 
plementing tobacco-control plans, conducting research and evaluation, and 
providing public and professional education. Based on data from the ASTHO 
survey, 49 states and the District of Columbia have tobacco-related coalitions, 
with membership including an average of 13 disciplines (13, 21). Their most 
important activities included public education (82%), legislative efforts 
(71%), professional education (47%), planning for tobacco control (45%), 
and research and evaluation (26%). In North Dakota, a well-organized coali- 
tion sought and received block grant funds from the Maternal and Child 
Health Program. Even small amounts of funding, such as $10,000, can be 
effective in supporting coalition activities (57), but most state-level coalitions 
are unfunded (21). 


PLANNING State health departments have created plans to solve specific
health problems, such as tuberculosis, sexually transmitted diseases, and 
measles. The Year 2000 Objectives call for all states and territories to create, 
implement, and monitor tobacco prevention and control plans (80). Planning 
articulates the process of public health practice, unites the disparate change 
agents, and focuses public health programs on the spectrum of tobacco control 
activities. 

According to the ASTHO survey, nine states had separate public health 
plans for tobacco use in 1989, and 16 states addressed tobacco use as part of 
another program plan (15). To assist states with tobacco-control plans, 
ASTHO analyzed existing plans and published a guide for developing control 
plans (5). The steps for developing a tobacco-control plan include the follow- 
ing: 
1. Utilize national expertise and resources and establish a coalition or ad-
visory group.

2. Assess the tobacco problem.

3. Develop the mission, goals, and objectives of the plan.

4. Analyze existing tobacco-control potentials.

5. Package and market the plan.

6. Evaluate and revise the plan.


To date, no formal evaluations of state tobacco control plans have been 
published. 


EVALUATION Evaluation studies of state-based tobacco-control programs 
are still rather rudimentary. The data from COMMIT are not yet available, but 
these will only cover 11 intervention and 11 control sites. ASSIST will not 
even begin until 1993 and will only cover 17 states. However, some data for 
use in evaluating states’ progress provide preliminary information about 
overall state activities on tobacco control. In this review, all state activities, 
including NCI trials, state and local legislation, coalitions, and tobacco 
control plans are part of the state-based effort. Comparisons of outcome data, 
such as changes in prevalence, quit rates, per capita cigarette consumption, 
and smoking-attributable mortality for different states and regions, are possi- 
ble by using currently available data sources. The outcomes change slowly in 
response to state antitobacco efforts. For example, the overall prevalence of 
adult smoking is decreasing at only 0.58 percentage points per year in the US 
(32). 

Changes in state-specific current smoking prevalence, quit rates, and per 
capita cigarette consumption have not yet been linked to data on the various 
state tobacco-control interventions. In addition, it is very difficult to differen-
tiate between cause and effect of these interventions. For example, a state in
which there is a large change in current smoking prevalence may have a
population sympathetic to a very restrictive law on smoking in public places.
The presence of the law may thus be the effect, rather than the cause, of favorable
changes in behavior. However, several community-based cardiovascular
risk reduction projects have demonstrated that multipronged interventions 
can have significant effects in reducing cardiovascular disease risk factors 
in targeted populations; the evidence for effective intervention is particu- 
larly strong for smoking. In North Karelia, Finland, significant reductions 
in smoking prevalence among men aged 30-59 years (28%) were achieved 
over ten years of intervention. In the Stanford Three Community Study, 
the number of cigarettes smoked per day decreased after two years of in-
tervention. The most important aspect in the evaluation of these in-
terventions is that reduction of risk factor scores produced significant reduc-
tion in cardiovascular disease outcomes (67).

It is also important to evaluate behavioral change in response to specific 
policies, such as those restricting smoking in public places. In places where 
evaluation studies have been carried out (e.g. worksites), some policy 
changes have resulted in no changes in overall cigarette consumption, some 
have been associated with a decrease in the daily consumption of cigarettes, 
and some have been associated with a decline in the prevalence of smoking 
(29, 84). Yet, such policies contribute to an overall social norm of not 
smoking. This change may be thought of as an “environmental” intervention 
similar to one that may be effective as part of the infectious disease control 
model. 

The overall goal for the education campaign funded by the recent tax 
increase in California is to reduce the smoking prevalence by 75% by the year 
1999. Preliminary evaluation data on consumption and cigarette smoking 
prevalence indicate that there may be 750,000 fewer smokers in California 
since the application of the tax and education campaign (47). The prevalence 
of smoking declined to 21.2% in 1990 (following the tax increase and 
institution of the educational campaign) from a 1987 baseline of 26.3% (74). 
Six months after the campaign began, preliminary results from the California 
media campaign evaluation showed that the awareness of the campaign was 
86.9% among in-school youths and 78.3% among adults. The proportion of 
adults who think about quitting increased from 38.6% to 41.8%, and the 
proportion of nonsmoking youths who think about starting decreased from 
24.6% to 21.4% during this period (IOX Associates 1991, personal com- 
munication). 
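
To put the California prevalence figures in the context of the campaign goal, the following sketch computes the absolute and relative decline between 1987 and 1990. The two prevalence values and the 75% goal are taken from the text. Treating the 1987 figure as the baseline for the goal is an assumption made here for illustration, since the text does not state the baseline year.

```python
# Adult smoking prevalence in California, in percent (from the text).
prevalence_1987 = 26.3
prevalence_1990 = 21.2

# Campaign goal: a 75% reduction in prevalence by 1999 (from the text).
GOAL_REDUCTION = 0.75

absolute_decline = prevalence_1987 - prevalence_1990
relative_decline = 100.0 * absolute_decline / prevalence_1987

# ASSUMPTION: the goal is measured against the 1987 baseline.
target_prevalence = prevalence_1987 * (1.0 - GOAL_REDUCTION)

print(f"Absolute decline 1987-1990: {absolute_decline:.1f} percentage points")
print(f"Relative decline 1987-1990: {relative_decline:.1f}%")
print(f"Prevalence implied by the goal: {target_prevalence:.1f}%")
```

On these figures, the decline by 1990 is about 5.1 percentage points, or roughly 19% of the 1987 level. Under the stated assumption, the goal would imply a prevalence of well under 10% by 1999.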

Additional evaluations of the effect of cigarette excise taxes have shown 
that these policies discourage smoking, particularly among teenagers (48, 
84). On a state basis, cigarette excise taxes have contributed to significant 
changes in consumption. Between 1955 and 1988, enactment of state ciga-
rette tax increases was associated with an average 3% greater decline in state
cigarette sales than in years without tax increases (59). 

The effect of laws restricting minors’ access to tobacco can be evaluated
through several sources, which include survey data on adolescent smoking 
behavior (such as the YRBS), as well as data from law enforcement sources 
charged with enforcing these laws. Vendors’ compliance with laws will 
not ensure behavior changes among adolescents, but the community non- 
smoking norm will be supported through visible enforcement of these laws. 
The CDC is conducting an evaluation of a law restricting minors’ access
to tobacco in Marquette, Michigan (A. Trontell 1991, personal communi-
cation).


THE FUTURE OF STATE TOBACCO-USE PREVENTION
AND CONTROL PROGRAMS 


Tobacco-control activities in the US will increase as state and local programs 
are further developed. ASSIST will begin in 1993, just as COMMIT is 
finishing. This multistate program will coordinate, provide training for, and 
evaluate tobacco-use prevention and control efforts in 17 states through 1998 
(51). It will use the technology and resources developed through COMMIT 
and the other eight individual intervention trials sponsored by NCI over the 
last several years (70). 

The Program Directions of the Department of Health and Human Services 
(in particular, Program Direction 3, Objective 2) call for a reduction in the
incidence of smoking among high-risk and other groups. Tobacco control 
interventions for minorities, women, youth, and the federal work force are 
called for by these directions. The directions also call for strengthening the 
capacity of the public health infrastructure to reduce smoking, especially 
among high-risk groups. Because so much of the public health practice of 
tobacco control is state-based, the National Institutes of Health and CDC will 
provide increased assistance to states as part of their responsibility to imple- 
ment these objectives (77). 

The successful tax initiative in California may similarly encourage other 
states to fund tobacco control programs. Despite reductions in the second 
round of legislative appropriations for Proposition 99, a substantial portion of 
the program funds will still be directed to the media campaign. Here and in 
other states, media campaigns will continue to expand in importance if 
financial resources are available. 

The Rocky Mountain Tobacco-Free Challenge will continue until the 
year 2000 (18). Key elements of this program include increased community 
interest, strengthened interstate and intrastate collaboration, promotion 
of state activities for reducing tobacco use, and long-term evaluation 
of tobacco-related policies. As additional resources become available, 
other regions of the country may adopt this innovative, competitive ap- 
proach. 

Additional coordination of state and local health department activities will 
be supported by ASTHO through the Tobacco Prevention and Control Net- 
work. These health professionals will serve as the opinion leaders for the 
diffusion of public health practice activities in tobacco prevention and control 
at the state and local levels. The public health practice of tobacco control 
continues to evolve, and evaluation methodologies for tobacco-control activi- 
ties need further development. No single intervention will stop the tobacco 
epidemic. Multifaceted public health activities for controlling tobacco use 
need continuous assessment and evaluation; as successful strategies emerge,
they should be adapted to different cultural and social environments. 

Finally, it is important to note the efforts of the tobacco industry in 
opposition to tobacco control programs and policies. About $3.3 billion is 
spent yearly on cigarette advertising; cigarettes are the second most common
subject of advertising in magazines and the most common in the outdoor 
media (mostly billboards) (12, 25). Each state and nearly every local jurisdic- 
tion considering tobacco-related public health legislation gains the attention of 
protobacco lobbyists. Therefore, the effects of major efforts to control tobac- 
co use in state and local jurisdictions may be masked by the well-funded, and 
often successful, efforts of the tobacco industry in defeating antitobacco 
initiatives. 


APPENDIX: YEAR 2000 OBJECTIVES FOR THE NATION ON TOBACCO AND 
HEALTH 


3.1. Reduce coronary heart disease deaths to no more than 100 per 100,000
people.

3.2. Slow the rise in lung cancer deaths to achieve a rate of no more than 42
per 100,000.

3.3. Slow the rise in deaths from chronic obstructive pulmonary disease to
achieve a rate of no more than 25 per 100,000.

3.4. Reduce cigarette smoking to a prevalence of no more than 15% among
people aged 20 and older. (Several special population targets are specified.)

3.5. Reduce the initiation of cigarette smoking by children and youth so that
no more than 15% have become regular cigarette smokers by age 20.

3.6. Increase to at least 50% the proportion of cigarette smokers aged 18 and
older who stopped smoking cigarettes for at least one day during the
preceding year.

3.7. Increase smoking cessation during pregnancy so that at least 60% of
women who are cigarette smokers at the time they become pregnant quit
smoking early in pregnancy and maintain abstinence for the remainder
of their pregnancy.

3.8. Reduce to no more than 20% the proportion of children aged 6 and
younger who are regularly exposed to tobacco smoke at home.

3.9. Reduce smokeless tobacco use by males aged 12-24 to a prevalence of
no more than 4%.

3.10. Establish tobacco-free environments and include tobacco use preven-
tion in the curricula of all elementary, middle, and secondary schools,
preferably as part of comprehensive school health education.

3.11. Increase to at least 75% the proportion of worksites with a formal
smoking policy that prohibits or severely restricts smoking at the
workplace.

3.12. Enact in 50 states comprehensive laws on clean indoor air that prohibit
or strictly limit smoking in the workplace and enclosed public places.

3.13. Enact and enforce in 50 states laws prohibiting the sale and distribution
of tobacco products to youth younger than age 19.

3.14. Increase to 50 the number of states with plans to reduce tobacco use,
especially among youth.

3.15. Eliminate or severely restrict all forms of tobacco product advertising
and promotion to which youth younger than age 18 are likely to be
exposed.

3.16. Increase to at least 75% the proportion of primary care and oral health
care providers who routinely advise cessation and provide assistance
and follow-up for all of their tobacco-using patients.
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INTRODUCTION 


In the last decade, our understanding of the epidemiology of mood disorders,
including bipolar disorder, major depression, and dysthymia, has advanced rapidly.
There is also a clearer understanding of the rates and risk factors, comorbidity
and social morbidity, and the changing patterns of these disorders, both 
nationally and cross-nationally. Notions about the age of onset of these 
disorders have changed considerably (1). Investigators now recognize that 
depression can occur prepubertally and often begins in adolescence and that
family history (i.e. the presence of a mood disorder in a first-degree biological
relative) is one of the most important risk factors. Considerable data, based on
controlled clinical trials, are now available on the efficacy of a broad range of
pharmacologic and psychotherapeutic treatments, for both acute and maintenance
treatment of the various types of mood disorders. This chapter
reviews these relatively recent advances in understanding the epidemiology,
familial nature, treatment, and morbidity of mood disorders.


DIAGNOSIS AND CLASSIFICATION


Mood disorder refers to a group of clinical conditions whose common feature
is the patient’s disturbed mood, either depression or elation. This grouping
does not imply a common etiology. Mood disorders are probably biologically
heterogeneous, comparable to the situation for mental retardation or jaundice.
The major distinction in mood disorders is between bipolar disorder and the
depressive disorders and, within the depressive disorders, between major
depression and dysthymia.

The concept of a mood disorder (sometimes called affective disorder) itself 
is noteworthy. This chapter could not have been written two decades ago. The 
conditions that are today grouped together as mood disorders were treated 
separately in the American Psychiatric Association’s Diagnostic and Statisti- 
cal Manual of Mental Disorders, second edition (DSM-II) as part of either 
psychosis or neurosis, the two predominant psychiatric categories in the 
1960s. 

Over the last decade, diagnostic criteria have been specified for the major 
mental disorders. These criteria are based on type, number, frequency, and 
duration of symptoms, as well as on exclusions, and were codified in 1980 in 
DSM-III (2). In 1987, minor revisions were made in the classification and 
published as DSM-IIIR. The DSM-IV will appear in 1994. 

The DSM-III abolished the distinction between psychotic and neurotic 
conditions and brought together several depressive conditions, which were 
first called affective disorders and, later, mood disorders in DSM-IIIR. The 
separation of depressions into bipolar and major depression is widely accepted 
because of differences in family patterns, effective treatment, and natural 
course. 

Table 1 lists the broad outline of the DSM-IIIR classification of the mood 
disorders. Each disorder can be further classified by severity, whether it is in 
remission, or whether psychotic features are present. Each disorder also has a 
category [Not Otherwise Specified (NOS)] for patients who typically do not 
fit the criteria for subclassification. Each diagnosis has specified criteria. The 
specified criteria for the major categories follow. 


Bipolar Disorder 


The presence of mania defines bipolar disorder. Mania is a distinct period 
during which the predominant mood is either elevated, expansive, or irritable 
and there are associated symptoms, including hyperactivity, pressure of 
speech, racing thoughts, inflated self-esteem, decreased need for sleep, dis- 
tractibility, and excessive involvement in activities that have high potential 
for painful consequences. 

Mania without major depression, sometimes called “unipolar mania,” is
uncommon, but does occur. Bipolar disorder can present as either a manic or
a depressive state. Cyclothymia, which is a mild chronic form of mood
swings, is interesting because of its aggregation in the biological relatives of
patients with bipolar disorder. For this reason, cyclothymia is considered part
of the spectrum of bipolar disorder. However, it is often difficult to
differentiate the boundaries between cyclothymia and normal moods.


Table 1 DSM-IIIR classification of mood disorders

Bipolar Disorder              Depressive Disorders
----------------------------  ----------------------------
Mixed                         Major Depression
Manic                           single episode
Depressed                       recurrent
Cyclothymia                   Dysthymia
Bipolar Disorder (NOS)[a]       primary or secondary
                                early or late onset
                              Depressive Disorder (NOS)

[a] Not Otherwise Specified.


Depressive Disorders 


MAJOR DEPRESSION The essential feature is either a dysphoric mood or a 
loss of interest or pleasure in all or almost all of usual activities and pastimes. 
The disturbance is prominent, relatively persistent, and associated with other 
symptoms, including appetite disturbance, change in weight, sleep dis- 
turbance, psychomotor agitation or retardation, decreased energy, feelings of 
worthlessness or guilt, difficulty concentrating or thinking, and thoughts of 
death or suicide or suicidal attempts. Major depression is only diagnosed in 
the absence of current or past manic symptoms. Although there is general
agreement that major depression is a heterogeneous disorder, there is no
consensus and little empirical basis for most of the subtypes used clinically,
such as endogenous, seasonal, or melancholic depression.


DYSTHYMIA The essential feature is a chronic disturbance in mood, involv- 
ing either depressed mood or loss of interest or pleasure in all or almost all 
usual activities and pastimes, and associated symptoms, but not of sufficient 
severity or duration to meet the criteria for major depression. The primary 
distinction between dysthymia and major depression is that the former is 
chronic, but symptomatically less severe, and must persist for at least two 
years to meet the criteria. Whether dysthymia is an independent disorder or a 
variant of major depression is controversial. At least some dysthymias are 
probably prodromal of major depression or the residual of untreated major 
depression. At this point, these issues have not been resolved, and the
DSM-IIIR divides dysthymia by whether it is primary or secondary to
another disorder and by age of onset, with age 21 being the division.

322 WEISSMAN & KLERMAN

Many patients have depressive symptoms, but do not meet the criteria for 
either dysthymia or major depression and do not have manic episodes. These 
patients often appear in primary care and medical clinics and are important 
from the public health point of view, because of their high prevalence and 
disability. Depressive symptoms that do not meet these criteria either do not
appear in the official DSM-III nomenclature or are classified as an adjustment
disorder that persists for more than six months secondary to major identifiable
psychosocial stresses.


EPIDEMIOLOGY 


With the exception of a few European studies, epidemiologic approaches 
were infrequently applied to the study of psychiatric disorders in the commu- 
nity until the 1970s. The major obstacles were the lack of specification of the 
diagnostic criteria and the difficulty in obtaining reliable diagnoses. 

In the US, the period after World War II was one of considerable activity in 
the epidemiology of mental impairment and health. Several classic 
epidemiologic studies of this period were completed, including studies that
showed the relationship between social class and mental illness, the effects of
changing traditions and values in a small town, and the effects of urban life. 
These studies demonstrated the importance of poverty, urban social stress, 
and social change in the development of impairment. The investigators used 
sophisticated statistical and sampling techniques. However, symptom or im- 
pairment scales that did not generate rates of specific psychiatric disorders 
were used. The findings from these studies did not have a major impact on 
clinical psychiatry (3, 4). 

Psychiatric epidemiology, clinical psychiatry, and clinical research did not 
begin to converge until the mid-1970s. The introduction into psychiatry of
specified diagnostic criteria, together with standardized methods of assessing
the signs and symptoms of psychiatric disorders needed to apply the criteria,
provided the technology for systematic diagnoses in epidemiologic studies.
Epidemiologic researchers were skeptical about the ability to use these
methods in community studies. The methods, which were first applied in
1975 in a small community study of 500 subjects who lived in New Haven,
Connecticut, were shown to be feasible and reliable (5). In the late 1970s,
a request from President Carter’s Commission on Mental Health for data on the
magnitude of psychiatric illness in the community for planning mental health
programs gave impetus to the next phase of studies.

In 1980, the National Institute of Mental Health (NIMH) Epidemiologic
Catchment Area (ECA) study was initiated, using the Diagnostic Interview
Schedule (DIS), a new instrument developed specifically for large-scale
epidemiologic studies of psychiatric disorders (6). The purpose was to collect
data on the rates, risk factors, and treatment patterns of the major mental 
illnesses in the community. The study, which is based on over 18,000 persons 
living in five communities in the US (New Haven, St. Louis, Baltimore, 
Durham, and Los Angeles), forms the basis of our current understanding of 
the epidemiology of the major psychiatric disorders (7). 

Parallel developments were being undertaken in England (8, 9, 10). As 
knowledge of ECA grew, similar studies that used identical methodology 
were undertaken in different parts of the world (see Table 2). In many cases, 
the staff in other countries were trained in the use of the DIS by Robins and 
colleagues in St. Louis. Thus, for the first time, independent, cross-national
comparisons of epidemiologic rates, which use data obtained with similar
methods, are now possible. The findings from these studies for the mood
disorders are summarized below.


Bipolar Disorder 


Table 2 Lifetime prevalence rates/100 in adults aged 18+ for bipolar disorder,
major depression, and dysthymia, based on community surveys using DIS and
DSM-III diagnosis

                                          Lifetime Rates/100[a]
Site              Time         N          Major Depression
----------------  ---------  -----        ----------------
USA-ECA           1980-1983  18572         4.4
  New Haven       1980        5034         5.8
  Baltimore       1981        3481         2.9
  St. Louis       1981        3004         4.4
  Durham          1982        3921         [illegible]
  Los Angeles     1983        3132         5.6
Edmonton          1983        3258         8.6
Puerto Rico       1984        1551         4.6
Florence[b]       1985        1000         6.2
Seoul             1984        5100         3.4
Taiwan            1982       11004
  Urban                       5005         0.9
  Small towns                 3004         1.7
  Rural area                  2995         1.0
New Zealand       1986        1498        12.6

[Bipolar and Dysthymia columns not legible in this copy.]

[a] Rates rounded off to one decimal in most cases.
[b] Only annual prevalence rates reported.


The community-based lifetime rates in the US for bipolar disorder are about
1% (range .7-1.6%). The rates are lower (about .5%) in Edmonton, Canada;
Puerto Rico; and Seoul, Korea. Only annual rates have been reported in
Florence, Italy, and these rates (1.3%) are similar to the US lifetime rates. 
The similarity in lifetime US and annual Florence rates may result because 
physicians are used as interviewers in Florence or because bipolar disorder is 
a chronic illness. The similarities, rather than the differences, in cross- 
national rates are notable, with the exception of Taiwan (about .1 and .6%). 
However, the rates for most psychiatric disorders, and particularly the mood 
disorders, are lower in Taiwan and in the limited unpublished data available 
from Shanghai (W. Lui 1990, personal communication). Without data from 
Chinese who live outside of the Republic of China (Taiwan) or the People’s
Republic of China, it is not possible to determine if this finding is unique to
the Chinese. 

The rates of bipolar disorder are similar in men and women. There is a 
trend for increased risk in urban areas. The mean age of onset is the late teens 
and early 20s, with many onsets occurring in adolescence. There is a suggestion
of temporal changes in the rates of bipolar disorder, with an increase in rates
and an earlier age of onset in the cohorts born since 1940 (11, 12). Family
history, although not assessed in the community surveys, remains one of the 
most important risk factors for bipolar disorder, as we discuss later. 

Marital problems and depression are often closely associated, although the 
specific type of depression has usually not been specified. For bipolar
disorder, the rates are highest for persons who are cohabitating but not married,
who have a history of divorce regardless of their current marital status, or
who have never married. The rates are lowest in married or widowed persons
without a history of divorce. However, assumptions about any causal relation- 
ship require caution. A break-up of a marriage may be a response by the well 
spouse to the stress of living with a depressed person, or may be brought 
about by the affected person who attributes the distress to failings in the 
spouse (13). 


Major Depression 


Major depression is considerably more prevalent than bipolar disorder and,
unlike bipolar disorder, has higher prevalence in women than in men. There is
also more variability in the rates by site. The lifetime rates vary between 3.5%
and 5.8% in the US sites, including Puerto Rico (Table 2). The higher rates in
New Haven are undoubtedly due to the use of slightly broader criteria for
major depression of one, rather than two, weeks’ duration. The rates are
highest in Edmonton and New Zealand. Again, Taiwan has low rates of
1%-1.7%.

Unlike Taiwan, Korea is a more westernized country, and the lifetime
prevalence rate of major depression is 3.4%, comparable to the lower end of
the ECA five-site range. Most non-US studies sampled more homogeneous
communities than did the five-site ECA study, which by design sampled
ethnically, racially, and socioeconomically diverse populations. Puerto Rico’s
data resembled the ECA results, with a lifetime prevalence of 4.6%.

Several risk factors in major depression have emerged from community 
studies. Female sex continues to be the clearest and most consistent risk factor 
in all sites listed in Table 2, as well as in family studies (14). Positive family
history of major depression, although not assessed in community studies, is 
also a risk factor and is discussed later. The ECA found few black-white 
differences in rates of major depression, once social class and education were 
controlled. However, as noted before, the Chinese appear to have sub- 
stantially lower rates of major depression than US whites or blacks, as 
reflected in the rate in Taiwan, as well as unpublished rates from an 
epidemiologic study in Shanghai, China. Unfortunately, there are no data 
from Japan. 

Rates are higher in urban than in rural areas (15), but there is no strong
effect of social class. As with bipolar disorder, a history of divorce or
separation had a profound effect on increasing rates of major depression.
More specifically, continuously married and never married people had the
lowest rate, and divorced people the highest.


Dysthymia 


Dysthymia is slightly less prevalent over a lifetime than major depression. In 
both Puerto Rico and Taiwan, however, lifetime risk for major depression and 
dysthymia are about equal. Dysthymia in the US, including Puerto Rico, 
ranges from 2.1% to 4.7%. The rates are considerably lower in Taiwan
(.9-1.5%). In all countries, the rate is about twice as high in females as in
males.

With dysthymia, the variations found with race, marital status, and urban/ 
rural areas are similar to those found for major depression. However, for 
dysthymia, unlike major depression, there is a significant inverse relationship 
with income, especially in young persons. Unlike major depression or bipolar 
disorder, the rate begins to decrease around age 45. The similarity in rates
and risk factors, and the high comorbidity between major depression and
dysthymia (termed double depression), have raised questions about whether
these are distinct disorders. This issue is still unresolved.


Changing Rates of Depression 


There is reasonably consistent evidence for a change in the rates of major 
depression, with higher rates in more recent birth cohorts and an earlier age of 
onset (14). This observation was first made in the 1960s and 1970s, based on
the following: admissions to hospitals for affective illness had increased
between 1950 and 1970, as compared with the previous three decades; the
average age of onset for major depression in clinical samples was considerably
younger than had been reported before World War II; childhood depression
was seen with increasing frequency in pediatric and psychiatric settings; an
increase in suicide attempts and deaths among adolescents was noted; and the
rates of major depression based on community studies in the elderly were low
(16). However, investigations of temporal changes in the rates of depression
and other psychiatric disorders before the mid-1980s were hampered by the
lack of large systematic studies that used standardized diagnostic criteria.
Thus, it was not possible to tell whether the differences observed in rates were
real or caused by changing methodology, treatments, or concepts.

The ECA study in the US, as well as several large family studies of
relatives of depressed patients that used diagnostic methods comparable to the
ECA, suggested the following temporal changes in the rates of major depression:
an increase in the rates of the cohorts born after 1940; a decrease in the
age of onset, with an increase in the teenage and early adult years (see Figure
1); an increase in rates for the cohorts born between 1960 and 1975, with an
increase in the rates of depression for all ages, but particularly among younger
age groups in that period; a persistent gender effect with the risk of major
depression consistently two or three times higher among women than in men;
a persistent family effect with the risk of major depression about two to three
times higher in the first-degree relatives of depressed patients, as compared
with controls; and a possible narrowing of the differential risk to men and
women because of a greater increase in the risk of major depression among
young men (14, 17-20).

Figure 1 The temporal trends (period-cohort effects) and lifetime prevalence of major
depression, from the ECA study at five sites. Includes both sexes, white only.

These trends were noted in epidemiologic studies, as previously described, 
in the US, Germany, Canada, and New Zealand, but they were not found to 
the same extent in studies conducted in Korea or Puerto Rico. 

Various efforts have been undertaken to explain the findings and to determine
whether they could be an artifact of selective mortality and/or
institutionalization, selective migration, changing diagnostic criteria,
threshold changes in reporting among mental health professionals and/or
society at large, reporting bias of interviewers, and recall problems among the
elderly (14).

In disorders with familial aggregation, exclusive genetic interpretations are 
ruled out by observations of temporal changes, because genes are unlikely to 
change in a relatively short time. The environmental risk factors for depres- 
sion that have also been suggested include changes in the ratio of males to 
females; increased urbanization; greater geographic mobility, which results in 
loss of attachments; increasing social anomie; changes in family structure; 
alteration in the role of women, especially the increased number of women in 
the labor force; and shifts in occupational patterns. 

Thus far, the increase in the rate over time and by birth cohort has been
best established for major depression. However, two independent studies 
show that the same increase may occur for bipolar disorder (11, 12). 


Future Directions in Epidemiologic Studies 


The American system of DSM-III, or its forthcoming DSM-IV, is not uni- 
versally used. A new diagnostic method, Schedules for Clinical Assessment 
in Neuropsychiatry (SCAN) is being field tested in 20 centers in 11 countries 
(21). The aim is to develop a comprehensive procedure for clinical examination
that is also capable of generating many of the categories of the International
Classification of Diseases, 10th edition (ICD-10), as well as the
DSM-III and IV. The Composite International Diagnostic Interview (CIDI),
based on the DSM-III and the ICD, has also been developed (22). The two
instruments are complementary, as the CIDI is designed for use in large 
community surveys that necessitate the employment of lay interviewers, 
whereas SCAN can only be used in its full form by clinically trained pro- 
fessionals. The availability of two diagnostic methods that bridge the major 
classification systems will facilitate future cross-national comparative studies. 

Lastly, none of the epidemiologic studies mentioned included children, 
even though mood disorders, as well as many of the major mental illnesses, 
often first occur in adolescence and, to a lesser extent, in childhood.
Currently, diagnostic methods applicable to epidemiologic studies of children
are being field tested at Columbia, Yale, and Emory Universities and the
University of Puerto Rico in preparation for a multisite epidemiologic study of
children comparable to the ECA study for adults. By the year 2000, we can
expect that information on the epidemiology of childhood psychiatric disorders
will become available.


GENETICS 


Evidence for the role of genetic factors in bipolar disorder and, to a lesser 
extent, major depression has accumulated over the past two decades, based on 
twin, adoption, and family studies. The interest in the genetics of bipolar 
disorder has recently been stimulated by the findings of linkage between 
bipolar disorder and markers on chromosome 11 and the X chromosome (23). 
The failure to replicate these findings has been a disappointment and high- 
lights the scientific problems in applying the new genetic approaches to 
complex disorders. 

In 1990, the NIMH launched a multisite collaborative program to study the
genetics of bipolar disorder, as well as schizophrenia and Alzheimer’s disease.
The goal is to establish a national resource of immortalized cell lines
and psychiatric histories in reliably diagnosed pedigrees and to collect
sufficiently large samples to detect any linkage. The focus in the first year has
been on standardizing the assessment procedures across sites. The recent
report in Nature of a point mutation in a single gene on chromosome 21 for
some forms of Alzheimer’s disease makes this collaborative project timely
(24).


Bipolar Disorder 


The selection of bipolar disorder for this first NIMH collaborative genetics 
initiative derives from the reasonably strong evidence from twin, adoption, 
and family studies for the genetic transmission of bipolar disorder. The mode 
of transmission, the spectrum of bipolar disorder, and the relationship
between bipolar disorder and major depression are unclear (11, 25, 26). With
an increasing number of chromosome markers available to map the entire genome,
perhaps more definitive information will become available on linkage by using
restriction fragment length polymorphisms. The findings of linkage could well call for
additional family and epidemiologic work to identify other factors, both 
genetic and nongenetic, that may modify the expression of a disorder. The 
concordance in monozygotic twins for bipolar disorder is much below 100%; 
thus, other factors may be operating. The findings of possible temporal 
changes in the rates of bipolar disorder imply that there are environmental 
factors in the expression of the disorder. 


Major Depression 


The phenotypic heterogeneity of major depression presents a problem for
recombinant DNA approaches, and the search in family and clinical studies
has been for subtype(s) of major depression that might be homogeneous and
possibly genetic. Twin data generally support the role of genetic factors in
some of the depressions, particularly psychotic depression (27). These find- 
ings do not extend to all subtypes of major depression and seem less likely 
with dysthymia. However, the twin studies completed by using modern 
diagnostic criteria have been based on very small samples, so that conclusions 
about concordance cannot be drawn. A large twin study by Kendler at the 
University of Virginia, which may clarify these issues, is under way. 

Over the past decade, there have been several well-designed family
studies, which indicate clearly that major depression is highly familial (see
Table 3) (28). The lifetime rate of major depression in first-degree relatives
ranges from 8.6% to 34.9%, which is usually about two- to threefold higher
than in relatives of matched comparison groups and higher than the population
rates. The Bland et al (29) study, which had the lowest rate of major depression
in relatives, also had the oldest patients with later ages of onset. No substantial
increase of bipolar disorder is found in the relatives of patients with major
depression.
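
As a rough illustration, the range quoted above can be checked against the
values printed in Table 3. The sketch below is in Python; the names
TABLE_3_MORBID_RISK and risk_range are hypothetical, and the pairing of each
study with its percentage assumes that the rows and values in the printed
table appear in the same order.

```python
# Morbid risk (%) of depression in first-degree relatives of unipolar
# depressive probands, transcribed from Table 3.
# ASSUMPTION: each study is paired with its percentage by row order in the
# printed table; the names below are illustrative, not from the source.
TABLE_3_MORBID_RISK = [
    ("Winokur et al 1982", 11.2),
    ("Gershon et al 1982", 16.6),
    ("Baron et al 1982", 17.7),
    ("Weissman et al 1984", 18.4),
    ("Bland et al 1986", 8.6),
    ("Stancer et al 1987", 24.4),
    ("Rice 1987", 28.6),
    ("Giles et al 1988", 34.9),
    ("McGuffin et al 1988, outpatient treatment only", 24.6),
    ("McGuffin et al 1988, inpatient + outpatient treatment", 11.8),
    ("Kupfer et al 1989, recurrent depressive probands", 20.7),
]


def risk_range(rows):
    """Return the (study, percent) pairs with the lowest and highest risk."""
    lowest = min(rows, key=lambda row: row[1])
    highest = max(rows, key=lambda row: row[1])
    return lowest, highest


if __name__ == "__main__":
    lowest, highest = risk_range(TABLE_3_MORBID_RISK)
    print(f"Lowest:  {lowest[0]} ({lowest[1]}%)")
    print(f"Highest: {highest[0]} ({highest[1]}%)")
```

Under that row-order assumption, the lowest value belongs to Bland et al and
the highest to Giles et al, which agrees with the 8.6% to 34.9% range given
in the text.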

There is a long history of searching for specific homogeneous subtypes of 
major depression. Some of the subtypes suggested as possibly homogeneous 
and more biological, such as endogenous or melancholic depression, have not 
been shown to have a higher familial aggregation than the nonendogenous, 
nonmelancholic subtypes. However, an increased familial aggregation and
specificity of transmission of early-onset (<20 yrs.) major depression has
been demonstrated (30).

Adoption studies have been inconclusive for major depression. A recent
adoption study from Stockholm did not find a higher rate of depression in the
biological parents of adopted-away depressed offspring (31). The lack of an
association may be due to the investigators’ reliance on hospital records.
Moreover, these data are at variance with other adoption studies of major
depression (32).


Table 3 Morbid risk of depression in first-degree relatives
of unipolar depressive probands[a]

Reference                               % (N) at Risk
--------------------------------------  ---------------
Winokur et al 1982 (62)                 11.2 (305)
Gershon et al 1982 (63)                 16.6 (133)
Baron et al 1982 (64)                   17.7 (143.5)
Weissman et al 1984 (65)                18.4 (287)
Bland et al 1986 (66)                    8.6 (763[b])
Stancer et al 1987 (67)                 24.4 (282)
Rice 1987[b] (68)                       28.6 (1176)
Giles et al 1988[b] (69)                34.9 (43)
McGuffin et al 1988 (70)
  Outpatient treatment only             24.6 (199.5)
  Inpatient + outpatient treatment      11.8 (187)
Kupfer et al 1989 (71)
  Recurrent depressive probands         20.7 (725)

[a] From Ref. 61a, with permission.
[b] Unadjusted number and morbid risk.


Dysthymia 


There are several family studies of dysthymia under way, and information 
will be forthcoming in the next few years. The family studies of patients with 
major depression find that dysthymia aggregates in their first-degree relatives, 
thus suggesting that dysthymia is on the spectrum of major depression. 


CHILDREN 


Whether or not a precise genetic etiology of the mood disorders is determined, 
the findings on familial aggregation have public health implications for 
children. Until recently, the conventional wisdom was that children were not 
capable of becoming depressed. In fact, before 1972, there were no textbooks 
of psychiatry that mentioned depression in children. With the increased use of 
systematic diagnostic assessments of children over the last decade, it has 
become clear that depression does occur in prepubertal children and is com- 
mon in adolescence. Moreover, the offspring of depressed parents are at 
increased risk for depression, as well as a variety of other types of social, 
school, and health problems. Several research efforts are under way to 
understand the clinical characteristics, familial aggregation, treatment, and 
course of depression in children. Although most of the studies have been on 
the children of parents with major depression (33-37), there have been some 
studies of the children of bipolar parents (38). 

In general, the findings show that major depression in a parent increases the 
risk for psychiatric disorders in children, particularly for major depression and 
anxiety disorders. In addition, the children of depressed parents, as compared 
with children of nondepressed parents, are more impaired in school and with 
peers and have higher rates of developmental and medical problems. 

Although the children of parents without psychiatric disorder also develop 
major depression, their rate is significantly lower and the age of onset may be 
later, most commonly in the midteens, and rarely prepubertally. In the one 
study that found specificity of transmission of age of onset of major depres- 
sion between parent and child, all the prepubertal depression occurred in the 
offspring of parents who had a first onset of major depression before age 20 
(33). A two-year follow-up study of the same cohort suggested that de- 
pressions that occur in the children of nondepressed parents tend to be more 
transient and milder (39). 

Based on a limited number of studies, the children of bipolar parents appear
to be at increased risk for psychiatric disorders and possibly cyclothymia.
Because bipolar disorder is less common than major depression, large sam- 
ples of children may be required to show an effect. However, if the proband is 
defined as the adolescent with bipolar illness, and the rates of illness are 
assessed in the adolescent’s first-degree relatives (i.e. siblings and parents), 
then an increased risk for both major depression and bipolar illness in the 
relatives is found (40, 41). It is unclear if bipolar disorder occurs pre- 
pubertally and, if so, what the early signs are. 

A related question regarding depressed children concerns the continuity 
between childhood and adult depression. To date, no longitudinal
study has sampled groups of children diagnosed as depressed by
modern diagnostic criteria and followed them into adulthood. The study
that comes closest to having the ideal design (42) used a “catch-up longitudin- 
al design” to assess adult psychiatric status and social adjustment of 52 
depressed children and adolescents, compared with 52 individually matched 
controls. The authors’ major findings were that the depressed children were at 
an increased risk for mood disorders in adult life and had elevated risk of 
psychiatric hospitalization and treatment. They were no more likely than the 
control group to have nondepressive adult psychiatric disorders. These find- 
ings strongly suggested that there was substantial specificity and continuity in 
mood disturbance between childhood and adult life. 

Information from long-term follow-up of depressed children to determine 
the continuity between childhood and adult disorders may clarify two puz- 
zling findings: depressed children do not have the same good response to the 
tricyclic antidepressants as seen in adults; and their sleep patterns and cortisol
response during depression differ somewhat from those of adults, although the
biologic studies on depressed children are limited. Better understanding of the 
nature of depression in childhood and its relationship to adult depression 
could have implications for treatment and earlier preventive intervention. 


SOCIAL MORBIDITY AND QUALITY OF LIFE 


The social morbidity and impairment of functioning in work and marriage in
patients with major depression have been well documented for more than 20
years. However, it is unclear how their functioning compares with that of patients
who have chronic medical conditions, or whether patients with depressive
symptoms that do not meet criteria for DSM-III disorders are impaired. These 
latter patients represent a large number who are seen in primary care and 
medical clinics, but do not come to the attention of mental health professionals
(43). The recent Rand Medical Outcomes Study (44,
45) monitored the patterns of morbidity outcome in health care in patients
with major depression, dysthymia, or depressive symptoms that do not meet
the full criteria for either. The investigators compared these patients with
patients who had chronic medical conditions, including
hypertension, diabetes, coronary artery disease, angina, arthritis, back problems,
lung problems, and gastrointestinal problems, as well as with a sample of
patients who had acute, but not chronic, medical problems (46). Of the
11,000 patients sampled, 466 had depressive symptoms that did not meet the 
criteria of disorder, and an additional 168 met the criteria for a depressive 
disorder. These patients were monitored over two years, and an assessment 
was made of the number of days in bed, self-perceived current health status, 
and extent of body pain in the past month. 

The major finding was that, among the patients seen by medical clinicians, 
those with major depression had poorer current health status than those with 
depressive symptoms alone. Patients in the two depressive samples and the 
eight chronic medical samples were compared as to physical activities, such 
as sports, climbing stairs, walking, dressing, and bathing; normal social role 
performance at work or in the household; and social functioning with friends 
and relatives. Patients with depressive symptoms had significantly worse 
social functioning and reported significantly more days in bed than patients 
with six of the eight chronic medical conditions, the main exception being 
coronary artery disease. Depressive symptoms and current medical condi- 
tions had additive effects with regard to measures of patient functioning 
and well-being. The functioning of depressed patients was comparable to
and, at times, worse than that of patients with several chronic medical conditions
(47). 

In a separate study, excess mortality from multiple causes was found in
a large community sample, derived from the ECA, of depressives over age 55
(48). Previous reports of mortality among depressed patients, also based on
community samples, indicated an excess of death by suicide and accidents for
younger depressed patients; in older age groups, suicide became less prominent,
whereas chronic medical conditions, especially cardiovascular disease,
accounted for the excess mortality.


TREATMENT 


Pharmacologic 


Since the 1960s, there have been advances in the treatment of depression: a
decrease in hospitalization, a reduction in the duration of episodes, and the
development of strategies for prevention of relapse and recurrence. Most treatment
for all mood disorders is now ambulatory. The tricyclic antidepressants have 
been available for more than two decades, and their therapeutic value for 
major depression was seen in the early 1960s. Only recently has there been 
sufficient experience with the range of doses and with blood-level determinations.
There is excellent evidence that the symptoms of depression can
be reduced in two to four weeks with pharmacologic treatment, usually a
tricyclic antidepressant. Soon after the antidepressants were introduced, 
however, investigators found that a high percentage of patients relapsed 
following short-term treatment; continuation therapy strategies were common 
in clinical practice and became the subject of several research studies. The
goals of continuation treatment are to sustain the remission brought about by
short-term treatment, to prevent relapse, and to facilitate social and economic 
functioning. There is no agreement as to the optimal duration of continuation 
drug treatment, although commonly six months to two years have been 
efficacious. Beyond one or two years, treatment is considered maintenance or 
prophylactic. The longest studies of drug therapy are of three years’ duration. 
These studies show clearly that maintenance treatment with lithium for bipolar
disorder and with tricyclic antidepressants for major depression markedly
reduces relapse rates.

For many patients, mood disorders fit the model of chronic illness, with
periods of remission and recurrence. For many, the need for treatment beyond 
the acute phase is increasingly supported by follow-up and treatment studies. 
Over the last decade, several newer antidepressants have appeared, with a 
wide variety of chemical structures and pharmacologic profiles (49). These
drugs have been aimed at counteracting or eliminating problems with the
original generation of antidepressants by reducing the anticholinergic effects,
e.g. dry mouth and urinary retention; lowering cardiac toxicity and the risk of
seizures; and/or reducing weight gain. One new drug, fluoxetine, a compound
with highly specific action on serotonin reuptake, has received considerable
publicity in the lay press, partially because it has few anticholinergic
side effects and does not produce weight gain. Whether its antidepressant
effects are remarkably different from those of the other available compounds is not
fully clear.

In regard to other new areas of pharmacologic treatment research, efforts 
are under way to test the efficacy of antidepressants in patients with dysthymia,
a disorder that has been primarily the domain of psychotherapy. The data on the
efficacy of tricyclic antidepressants in depressed children and adolescents are
limited and inconclusive. Data on the efficacy of tricyclic antidepressants in 
geriatric depressed patients, based on controlled clinical trials, are also lim- 
ited. However, the data that are available suggest that the usual antidepressant 
drugs in lower doses are efficacious in this population and that the side effects 
are problematic. These populations—dysthymics, depressed children, adoles- 
cents, and the elderly—clearly need further study. 

The full details of the treatment of bipolar disorder, as well as our scientific
understanding, have recently been summarized by Goodwin & Jamison (50).
This book in itself is an achievement of this past decade, because of its 
reliance on and gleanings from empirical evidence.


Psychotherapy


Over the last ten years, there has been considerable improvement in the 
quality and quantity of information on the efficacy of psychotherapy treatment 
in comparison and in combination with pharmacotherapy of adults with major 
depression. Similar data on the efficacy of psychotherapy are not available for 
bipolar disorder or dysthymia or for adolescents with major depression, 
although several clinical trials are now in the planning phase. Several short- 
term psychotherapies developed specifically for depression, particularly 
cognitive behavioral therapy (CB), interpersonal psychotherapy (IPT), and some
behavior approaches, have been specified in manuals. These manuals standardize
the treatment and are used in the training of therapists who conduct the
treatment in the clinical trials (51). Currently, about 20 clinical trials test the 
efficacy of these psychotherapies in homogeneous samples of patients with
major depression. Two recently published, large-scale treatment studies de- 
serve attention: the NIMH Collaborative Treatment Study (52) and the Main- 
tenance Treatment Study of Recurrent Depression (53). 


NIMH COLLABORATIVE TREATMENT STUDY In 1980, the results of several 
small clinical trials were sufficiently promising that the NIMH initiated
the first multisite collaborative study of the treatment of depression to include 
psychotherapy. Based on the models used to test the efficacy of the new 
psychotropic drugs in the 1960s, this study was designed to test IPT and CB. 
In three university centers, 250 depressed patients were studied simultaneous- 
ly. Overall, the findings showed that all active treatments were superior to 
placebo in the reduction of symptoms over a 16-week period. The overall 
degree of improvement was highly significant clinically. Over two thirds of 
the patients were symptom free at the end of treatment. More patients in the 
placebo-clinical management condition dropped out or were withdrawn, twice 
as many as in the IPT group, which had the lowest attrition rate. At the end 
of 16 weeks of treatment, the two psychotherapies and imipramine were
equivalent in the reduction of depressive symptoms and in overall func- 
tioning. Imipramine had the most rapid initial onset of action and the most 
consistent positive effect on the various symptom measures. Although many
of the less severely depressed patients improved with all treatment condi- 
tions, including the placebo group, the more severely depressed patients 
in the placebo condition did poorly. For the less severely depressed group, 
there were no differences among the treatments. The severely ill patients in 
the IPT and imipramine groups had significantly better response (fewer depressive
symptoms) than the placebo group (52). There has been some
controversy about the analytic approach used in this study. Reanalysis will 
soon be forthcoming, although it is unclear if the results will change.


THE MAINTENANCE TREATMENT STUDY OF RECURRENT MAJOR DEPRESSION Also in the
early 1980s, the University of Pittsburgh group
undertook a long-term clinical trial to determine the efficacy of drugs (imi- 
pramine) and/or IPT in the prevention of relapse for severe recurrent depres- 
sion (53). The impetus for this study was the finding that many patients with 
multiple recurrent episodes were difficult to treat, had a high relapse rate, and 
were high utilizers of medical and social services. In this study, patients with 
recurrent depression, who had responded to imipramine plus interpersonal 
psychotherapy, were randomly assigned to one of five treatments for three
years of maintenance treatment: IPT alone; IPT and placebo; IPT and imipramine;
clinical management and imipramine; and clinical management and
placebo. Contrary to previous experience, imipramine was administered in
the highest doses (over 200 mg), and IPT was administered monthly, in the 
lowest dose ever used in the clinical trials. There were four major findings: 
a high rate of recurrence in one year for untreated control groups; clinically
meaningful and statistically significant prevention of relapse and recurrence
by both imipramine and IPT; a nonsignificant trend towards value of combined
treatment over either treatment alone; and the value of high-dose
imipramine (over 200 mg/day; previously, considerably lower maintenance
doses had been recommended). This long-term study, along with several
others that used drugs with and without psychotherapy, clearly established the 
value of maintenance treatment in the prevention of relapse and recurrence in 
unipolar depression. 

Alternatives to medication as a treatment for depression are enormously 
important. For various reasons, many patients will not or cannot take drugs, 
e.g. women of child-bearing age and the elderly, who often have concomitant
medical problems (54).


The NIMH Depression Awareness, Recognition, and 
Treatment Program (DART) 


Emerging epidemiologic findings on the prevalence and morbidity of depression
and on available treatments, together with numerous studies of clinical practices
in primary care and other general medical settings, have indicated that only one
half of patients with depressive and other psychiatric conditions are detected
by general and family practitioners. In 1989, the NIMH initiated
DART, a program of secondary prevention of depression. This program is 
comparable to the one initiated by the Heart Institute to educate the public 
about the treatment of hypertension. Initial efforts have been on educating the 
public and professionals about the availability of effective treatments for 
severe disorders, particularly bipolar and recurrent unipolar depression. In the 
early policy discussions on the focus of the DART, relatively low priority was 
given to patients with milder depressions, including those with depressive
symptoms seen in general medical health settings. It is still too early to know
if the program will have an effect on improving detection and treatment of
depression. A strong evaluation component has not been added to the program.


CONCLUSION 


There has been considerable progress in understanding the epidemiology and 
familial patterns of the mood disorders and in testing new treatments. The 
challenge of the 1990s is to bridge the gap between our understanding of 
treatment efficacy and the delivery of service; to identify the genetic mech- 
anism(s), particularly for bipolar disorder; and to learn more about the 
continuity between childhood and adult depression so that appropriate in- 
terventions can occur earlier. The opportunities for secondary and tertiary 
prevention have increased. With increased knowledge of risk factors, particularly
familial risk factors suggestive of genetic contribution for some of the
mood disorders, the opportunity for primary prevention now seems less
remote.


INTRODUCTION


Social marketing is often perceived as a contradiction in terms and an odd fit 
for the public health professional. For if marketing, the business of selling 
goods and services, is pursued singlemindedly—exclusive of all other con- 
siderations but profit—it will eventually clash with the social purpose of 
public health. Yet, in less than 20 years, social marketing for health has 
emerged as a recognized practice. 

Multiple channels of mass communication and new methods of knowledge 
diffusion have touched all but the more remote and isolated communities. 
Messages aimed at influencing personal choices and decisions come from 
several sources at any given time, and often at cross purposes. Useful 
information reaches an ever larger number of people and improves prospects 
for good health. But, the same channels of information have also conveyed 
words and images harmful to health. The changing environment of communication
provides an important backdrop for efforts to change attitudes and
behavior, of which social marketing is an example. This article reviews the
origin of social marketing, its practices, its strengths and weaknesses, and its 
place in the future of public health.


What Is Social Marketing?


Forty years ago, Wiebe (67) asked, “Why can’t you sell brotherhood and 
rational thinking like you sell soap?” Few responded to his challenge at the 
time. The use of advertising media for a social purpose had existed for 
decades, and audiences were familiar with public service announcements 
(PSAs) and campaigns that popularized such slogans as “Uncle Sam Wants 
You.” However, marketers did not consider social causes in terms of product, 
price, and place until the 1960s and 1970s (60). By the late 1960s, such 
marketers as Richard Manoff were applying the full range of marketing 
techniques to nutrition and other health education campaigns (49). The gener- 
al heightened social consciousness at the time may well have helped initiate 
marketing’s probe into the social arena. Some advocates of change learned 
and used marketing techniques to advance their causes. 

In 1971, marketing professor Philip Kotler and his collaborator, Gerald 
Zaltman, called the application of marketing practices to nonprofit and social 
purposes “social marketing.” They described it as “a promising framework
for planning and implementing social change” (31). Social marketing at- 
tempts to persuade a specific audience, mainly through various media, to 
adopt an idea, a practice, a product, or all three. It is a social change 
management strategy that translates scientific findings into action programs. 
It combines elements of traditional approaches and modern communication 
and education technologies in an integrated, planned framework. 

Social marketing uses marketing’s conceptual framework of the 4 Ps: 
Product, Price, Place, and Promotion. Social marketers adopted several 
methods of commercial marketing: audience analysis and segmentation; con- 
sumer research; product conceptualization and development; message de- 
velopment and testing; directed communication; facilitation; exchange theory; 
and the use of paid agents, volunteers, and incentives. 

Audience analysis is needed to identify segments for specific approaches. 
Consumer research yields valuable data about the wants and needs of targeted 
segments and provides a basis for product design and message development. 
Testing sharpens the effectiveness of products and messages. Specific chan- 
nels appropriate to the targeted segments are chosen for product distribution 
and message dissemination. Paid and voluntary agents reinforce and facilitate 
message dissemination and product distribution by face-to-face communica- 
tion. Incentives are employed to motivate the sales force and stimulate 
consumer demand. Exchange theory illuminates the relationship between 
price and perceived benefit. 

However, there is not a universally accepted definition of legitimate social 
marketing. Such lack of consensus has contributed to misconceptions about 
the role of social marketing in public health and has probably fueled skepticism
and criticism. Although the American Marketing Association has been
challenged to provide a standardized definition (45), the official definitive
statement has yet to be written (36). 


REVIEW OF THE LITERATURE AND EXPERIENCES


Literature


Social marketers have written many articles about their experiences. Besides 
discussing and arguing for their respective definitions, they have written 
about their successes, the difficulties encountered, and the lessons to be 
learned. In addition, theorists from various fields have explained and cri- 
tiqued social marketing. The literature on social marketing now spans some 
40 years, counting Wiebe’s original challenge. But, the bulk of the written 
work is concentrated in the last 25 years and can be divided roughly into three 
periods: early theory, experiences evaluated, and increasing acceptance. 


EARLY THEORY In the late 1960s and early 1970s, theorists attempted to 
define and justify social marketing amid criticism from all sides. There were 
four central questions: What is social marketing? What is its role? Is it 
possible? Is it marketing? 

Ironically, Wiebe has seldom been given credit for his own thoughtful 
answer to his challenge: “Advertising does not move people to unilateral 
action. It moves them into interaction with social mechanisms . . . It is the 
crucial importance of the retail store, viewed as a social mechanism which 
facilitates the desired behavior, that social scientists often seem to overlook 
when they yearn for behavioral changes comparable to those achieved by 
advertisers” (67). Although Wiebe uses the word advertising, his insistence 
on an adequate and compatible social mechanism and his concept of “dis- 
tance” (the effort audience members believe the new product or behavior 
requires, compared with its benefit) indicate that he was talking about social 
marketing (the comprehensive use of marketing methods for a social cause), 
and not merely social advertising (the use of advertising media to publicize a 
social cause). 

Debate on the role of marketing for social causes began in earnest in the late 
1960s and accelerated in the 1970s, much of it in the marketing journals. 
Martin’s “An Outlandish Idea: How a Marketing Man Would Save India” (50) 
led the way, followed by numerous discussions of how marketing should 
change or broaden its concept to meet the needs of society (2, 13, 25, 29, 34, 
45). Lazer (35) proposed that marketing’s responsibility was only partially 
fulfilled through economic processes, whereas Dawson (8) and Lavidge (34) 
predicted the new question for marketers would soon be whether the product 
or service should be sold at all. Kotler & Levy (28) proposed “demarketing” 
to reduce demand for certain products. Against this backdrop of questioning
and redefinition within the marketing field, Kotler & Zaltman (31) proposed
social marketing as an approach to planned social change and outlined its 
essential features. 

Not everyone greeted marketing’s expanded role with enthusiasm. Luck 
(45) objected that replacing a tangible product with a complex bundle of ideas 
and practices overextended the exchange-of-value concept, which even social 
marketing proponents agreed was at the heart of the marketing discipline. 
Takas (64) noted that the ongoing debate about social marketing was un- 
known or ignored by most of the business community, for whom the essential 
concern remained sales for profit. Nevertheless, the new ideas took hold, and, 
by 1973, several reports and case studies of social marketing projects began to 
appear in the literature (9, 36). 


EXPERIENCES EVALUATED In the late 1970s and early 1980s, while theo- 
rists wrangled, practitioners eagerly applied the new approach to several 
fields, notably family planning, and asked, Does it work? How does it work? 
What are the constraints? 

During this period, many theorists turned their attention away from the 
debate over definitions and toward the growing mound of data from social 
marketing efforts (27, 43, 44, 55). Books and articles that explained the social 
marketing process and gave guidelines for the practitioner included Kotler’s 
Marketing for Nonprofit Organizations (26); Manoff’s Social Marketing: A 
New Imperative for Public Health (49); applications to specific fields, such as 
nutrition (24); and studies of strategy mix, channels, and evaluation (1, 4, 
59). 

In 1980, Fox & Kotler (15) described the evolution of social advertising 
into social communication and social marketing. Social marketing added four 
elements to social communication: marketing research, product development, 
use of incentives, and facilitation. However, objective evaluation was lack- 
ing. For example, Bloom (4) deplored the tendency of projects to use “after 
only” or “before and after” studies with no control group, a practice that might 
identify ineffective programs, but could not show causal relationships be- 
tween program and outcome. Theorists also gave increased attention to the 
conditions in which social marketing efforts were most successful and to the 
constraints and difficulties likely to be encountered. 

Contraceptive social marketing provided early, well-documented suc- 
cesses. Population Reports (61) summarized the results of 30 contraceptive 
social marketing projects in 27 countries, with a lengthy bibliography. The 
report concluded that social marketing was successful in providing protection 
against unwanted pregnancies at a lower cost than most other approaches. 
Nevertheless, parallels between commercial and social marketing were imperfect.
Rothschild (54), for example, identified problematic differences with
regard to product, price, segmentation, and, especially, the construct of
involvement. He suggested that the public’s involvement with social causes 
may be bimodal (very high or very low), whereas public involvement with 
consumer goods is typically middle-range, thus making the promotional tools 
used for marketing commercial consumer goods inadequate for social tasks. 

Bloom & Novelli (5) produced a litany of problems that marketers faced in 
the public health arena. They cited the following difficulties: obtaining con- 
sumer research and data, especially behavioral data; sorting the relative 
influence of determinants of behavior; classifying and narrow-targeting seg- 
ments; formulating and shaping simple product concepts; pricing; choosing 
channels and designing appeals; pretesting methods and materials; im- 
plementing long-term positioning strategies; and ignoring those segments 
most vulnerable and often most negatively oriented to the message. Organiza- 
tional problems included poor understanding of marketing activities; treat- 
ment of plans as archival, rather than action, documents; and “institutional 
amnesia.” 

Further problems occur because, rather than encouraging people to do 
something, as commercial marketers do, social marketers must often dis- 
courage behaviors that may be attractive to the audience or deeply ingrained. 
Solomon (62) concluded that “marketing concepts cannot be applied 
wholesale to social campaigns without a great deal of thought and sensitiv- 
ity.” A veteran marketer has said, “It’s a thousand times harder to do social 
marketing than packaged goods marketing” (15). Social marketing finished its 
first decade with cautious optimism, a more realistic estimation of both its 
limits and its potential. 


INCREASING ACCEPTANCE By the late 1980s, social marketing had become 
an accepted practice, while taking some surprising new forms. However, 
fundamental questions still have not been answered: Does social marketing 
deliver what it promises? What is the impact of connecting marketing and 
social causes? What effect does it have at the sustainable behavior level? Is it 
cost-effective? Is it ethical? 

Since the late 1980s, there have been more publications to guide the social 
marketer, including a comprehensive text by Kotler & Roberto (30). Lefebvre 
& Flora (37) reviewed the social marketing field from the perspective of 
health promotion/education. They cited the orientation to consumer needs as 
social marketing’s most important contribution, despite such barriers as the 
propensity of public health programs to be “expert-driven.” They concluded 
that although not a panacea, “health marketing has the potential of reaching 
the largest possible group of people at the least cost with the most effective, 
consumer-satisfying program,” if practitioners thoroughly understand its concepts
and limitations and have mastered its skills. Although there has been a
broader acceptance of marketing principles in many health spheres, some
remain critical of social marketing’s ethical dimensions, its impact, and its 
capacity to deliver what it promises. 


Concerns about Social Marketing 


ETHICAL ISSUES Questions about the ethics of social marketing surfaced 
soon after the concept was introduced. As early as 1979, Laczniak et al (32) 
polled more than 300 experts, such as professors of ethics, psychology, and 
economics and marketing practitioners, and found a wide range of ethical 
concerns. Some respondents feared that marketers were getting in over their 
heads, by acquiring social power without a full sense of the issues or their 
responsibility. In the words of one, “social marketing could ultimately operate 
as a form of thought control by the economically powerful.” Marketers were, 
in general, more favorable toward the new discipline, but they too had their 
concerns. Some feared that the public would associate marketing with con- 
troversial causes and, thus, perceive them as “neopropagandists” (that is, the 
field of marketing would suffer from the taint of social causes). This is a 
surprising assertion, because the shoe is usually on the other foot in debates 
about the marketing of causes. Laczniak et al found general concern that 
social marketing would likely operate without any control and regulation, in 
contrast with health education, whose professional associations gave serious 
attention to self-imposed ethical codes. 

Because advertising is a key component of marketing, the debate on the 
ethical aspects of advertising has some bearing on social marketing. Some 
feel that the negative aspects of advertising outweigh the benefits of a social 
marketing campaign, no matter how noble the cause. Pollay (53) reported the 
consensus of 50 noted humanities and social science scholars: Advertising’s 
effect, among other things, is to trivialize real experience and engender 
materialism, cynicism, anxiety, disrespect for age and tradition, loss of 
self-esteem, and a preoccupation with sex and competition. Holbrook (20) 
responded that advertising is a mirror of societal norms, which reflects many 
wholesome values, such as family affection, generosity, patriotism, positive 
anticipation, and joy. These opposite points of view probably stem from 
different assessments of the merit of the consumer society and its capacity to 
provide human fulfillment. 

Health educators also expressed ethical concerns about the new discipline. 
Some concerns related to the concept of victim-blaming and the debate about 
persuasion versus coercion, current in the 1970s and 1980s (11, 12, 18, 51, 
57, 68). Victim-blaming occurs when individuals are held responsible for 
their problems, thus obscuring institutional and societal forces over which 
they may have little control (for example, economic status, working conditions,
public policies, and laws). Marketing efforts usually address individuals
and encourage individual behavior change, thus implicitly holding
individuals responsible for the solutions to problems. 

It can also be argued, however, that social marketing is a tool, like the 
telephone, which can be used for a positive end, such as fostering human 
interaction, or for a negative purpose, such as obscene calls. In this review, 
we regard social marketing as an instrument, but the ethical dimensions of 
social marketing clearly deserve continuing attention. 


DISEMPOWERMENT In addition to ethical concerns, social marketing has 
been criticized as ineffectual or even counterproductive. For instance, Werner 
(56) criticized social marketing’s emphasis on commercial products,
claiming that it is at odds with the philosophy of community empowerment.
Werner alleged that oral rehydration solution (ORS) manufacturers, both 
private and government, were reluctant to accept a cereal-based ORS for fear 
of encouraging home-based mixes. According to this view, even the selling of 
ORS creates dependency and detracts from the empowering knowledge of the 
principle of treating diarrhea. 

Social marketing has also been criticized for reaching the wrong audiences. 
Luthra (46) pointed out that in Bangladesh, mass media channels, such as 
television and the press, are primarily accessible to men and the urban elite.
She argued that a literacy rate of 16% among women makes instructional
billboards and newspapers useless for most mothers. Furthermore, important 
information about contraceptive use and side effects was not made available 
in a form appropriate to the target audience until after sales decreased because 
of user dissatisfaction. Luthra concluded that social marketing is not respon- 
sive to the needs and concerns of the user, but is driven by marketing and 
sales signals defined by Western commercial marketing practice. 


THE COMMERCIALIZATION OF HEALTH INFORMATION In the 1980s, with 
the general ascendancy of supply side economics and the popular 
acknowledgment of the success of market mechanisms during the latter half of 
the decade, the bias against commercialism subsided. Commercial terms 
gained increasing acceptance, even in countries where the economies had 
long been centrally planned. Public health services became “products,” peo- 
ple became “clients” and “consumers,” and organizations with a product to 
distribute became “vendors.” The decade saw a marked growth in the practice 
of social marketing for health, as well as health-related commercial marketing 
and cause-related marketing. 

Health-related commercial marketing emerged in the late 1980s, when the 
Kellogg Company cited National Cancer Institute (NCI) findings in marketing 
its high-fiber All-Bran cereal. Kellogg “educated” the public, while increasing
its market share from 36% to 42%; thus, it started a major marketing
trend (16). Kellogg claimed that after the campaign, over 90% of Americans
knew the fiber-cancer message and had heard it an average of 35 times. The 
educational aspects of the campaign were questioned by Levy & Stokes (38), 
however, because the benefits did not generalize to other high-fiber cereals 
until those companies mounted their own cancer education/marketing cam- 
paigns. Although nonprofit sources generally enjoy greater credibility than 
profit sources, the Kellogg-NCI combination is perceived as almost as cred- 
ible as the nonprofit source alone (19). Thus, Kellogg may have raised its 
credibility, while NCI gained greater exposure at no cost, as a result of what 
Freimuth et al (16) called “seductive” collaboration. 

Cause-related marketing is a similar commercial/social marketing blend. In 
this strategy, corporations donate a percentage of their profits to a cause, thus 
lending marketing expertise and support to a cause, while enhancing their 
images and making profits. In the early 1970s, for example, the US Committee
for the United Nations International Children’s Emergency Fund (UNICEF)
cooperated with several companies that announced their support for UNICEF in
their marketing and tied the amount of their contributions to the volume of
sales of their products. Caesar (7) describes other examples,
such as American Express’ pledge to donate one cent to the Statue of Liberty 
renovation fund each time its card was used. During that period, American 
Express raised $1.7 million for the renovation project, while increasing the 
use of its cards by 30%. Studies to measure impact for the corporate sponsors, 
as well as for public health, are needed (16). 

As the 1990s begin, our review of the literature shows that social marketing 
has become more pervasive in public health. Although some complain that it 
is often adopted piecemeal and without a system of operational procedures, it 
has arrived at the end of its second decade with a measure of maturity: it is
generally considered a useful practice, but is still not fully understood by
many health professionals.


Examples of Social Marketing from Developing Countries 


Although marketing is deeply rooted in business practice in the United States 
and other developed countries, the deliberate practice of marketing for public 
health has found its most complete expression in the less developed countries. 
Various social marketing activities have been undertaken for nutrition, family 
planning, and other public health projects in Asia since the late 1960s and 
early 1970s; subsequently, these activities were extended to Africa, Latin 
America, and the Middle East. Public health problems in the developing 
nations are so large and urgent that both immediate actions and innovative 
approaches are required. The adoption of public health marketing practices
in developing countries has been helped by the fact that the few modern mass
media available are usually government owned and operated and, therefore, more
obliged in principle to devote time to social development activities. The
overwhelming and sometimes monopolistic power of these centralized media
was evident to public health professionals. The family planning pioneers in 
developing countries knew that their cause was controversial and were eager 
to argue their case in various public fora, particularly through the media. 
Thus, family planning has often led the way in innovative communication 
strategies, including social marketing techniques. 

Nine illustrative projects, chosen for their variety of subjects, approaches, 
and geographic representation, have been divided into three groups (see Table 
1). The information is based on documents and reports provided by the 
institutions responsible for the projects. The descriptions are necessarily brief. 
Readers are encouraged to refer to the sources and institutions cited for more 
complete information, including statistical data. 

The above-mentioned standard social marketing procedures were used in 
these projects, except where particular techniques are mentioned. These 
examples do not yet demonstrate long-term impact on behavioral change; 
therefore, the cost of behavioral change is not available. More evaluations and 
studies are needed to determine cost-effectiveness. 


Table 1  Examples of social marketing

Program                                     Organizations Involved              Years

Tangible Products
  Egypt: National Control of Diarrheal      John Snow Public Health Group       1983-1988
    Diseases Project (NCDDP)
  Dominican Republic: Contraceptive Social  Futures Group, AED, Doremus,        1984-1989
    Marketing                               Porter & Novelli, John Short
                                            Associates
  Bangladesh: Contraceptive Social          Population Services International,  1974-1987
    Marketing                               Manoff Int.
  Kenya: Condom Promotion                   Population Services International   1972-1974

Sustained Health Practices
  Cameroon: Weaning Project                 CARE, Manoff Int., Educational      1985-1989
                                            Development Center
  Indonesia: Weaning Project                Manoff International                1984-1989
  Malaysia: PEMADAM Dadah/Drug Prevention   Government of Malaysia              1976-present

Services Utilization
  Colombia: National Vaccination Crusade    UNICEF, WHO, PAHO                   1984-1994
  Philippines: Expanded Program             HealthCOM, AED                      1984
    Immunization


TANGIBLE PRODUCTS A diarrheal disease control program in Egypt
achieved impressive results. In December 1984, one year into the campaign,
approximately 90% of the mothers surveyed recognized the dangers of
dehydration, compared with 32% in May 1983; 95% knew of oral rehydration
therapy (ORT); and, among those who used ORT in 1984, approximately
60% mixed the solution correctly, compared with 25% in 1983 (58).

In the Dominican Republic, contraceptive social marketing implemented 
by Profamilia, a local family planning association, achieved its objectives: 
increased availability of Microgynon birth control pills, increased use among
women of lower socioeconomic status, increased contraceptive prevalence, and
increased involvement of the private sector, with a consequent expansion of
market outlets. In collaboration with a private sector manufacturer of oral
contraceptives, Profamilia reduced the price of Microgynon by 50% and sold it
under a
new logo. In a five-year period, Profamilia generated enough sales revenue to 
recover all operating costs and become self-sufficient. Microgynon purchas- 
ers represented an expanded market (34% new acceptors), as well as brand 
switchers already in the commercial market (66%). Some 89% of the clients 
surveyed planned to continue using Microgynon (17, 63). Equally im- 
pressive, however, is the overall trend in the total orals market. During the 
five-year period, the contraceptive social marketing program contributed to a 
30% increase in the total orals market, without eroding the market shares of 
other leading orals manufacturers. 

Bangladesh is acclaimed as having one of the most successful contraceptive 
social marketing projects. In one decade, the program sold over 130 million 
condoms and over 2.2 million cycles of oral contraception. In 1984, the 
project served 40% of all contraceptive acceptors (many being rural) by 
selling low-cost products through retail and wholesale outlets. Qualitative 
research techniques, such as focus group discussions and in-depth interviews, 
were used to identify the major resistance points to using contraception. 
Investigators concluded that men should be the primary target audience of the 
media program, because they were the most resistant, ignorant, and unwilling 
to consider family planning. Research concerning current users confirmed 
that husbands were an important source of instruction. Fourteen months after 
the radio portion of the campaign began, the number of persons who believed 
that modern family planning methods are unsafe decreased and interpersonal 
discussions about family planning and recognition of the personal economic 
benefits of family planning increased. Contraceptive social marketing efforts 
in Bangladesh drew attention to both the private and public sectors, expanded 
the market, and used indigenous institutions in program planning, operation, 
and evaluation (33, 46, 58). 

Through mass media in Kenya, social marketing emphasized the quality
image of the Kinga condom, reflected in product design, packaging, and moderate
cost. Commercial shopkeepers and a mobile sales team were used as condom 
distribution channels and proved effective in extending accessibility to rural 
areas. The promotional campaign had a significant impact on contraceptive 
practice. Current method users among survey respondents rose from 21% to
35% in one year, whereas the control group showed little change. In addition
to promoting sales, the campaign created a high level of brand awareness.
After six months of marketing, 85% of male survey respondents were
aware of Kinga condoms. Of those who had heard of Kinga, 80% were 
able to describe its purpose as a contraceptive, rather than as a venereal 
disease prophylactic, a function that was deliberately included in the 
campaign messages. Before being educated about Kinga, only 23% of sur- 
vey respondents spontaneously mentioned condoms when discussing 
contraceptive methods. Six months into the campaign, this figure had 
risen to 57% (3). 


SUSTAINED HEALTH PRACTICES The Cameroon Weaning Project indicates 
that social marketing techniques can be successful in strengthening com- 
munity-based health education in remote areas. The Cameroon Project pro- 
vided a unique opportunity to employ social marketing under extremely tough 
conditions, because of the limited resources of the implementing agency (a 
private voluntary organization) and the difficult social and ecological environ- 
ment. Despite the difficult circumstances under which the program was 
undertaken, moderate gains were demonstrated. Improved skills of CARE 
staff in conducting quantitative and qualitative research, applying appropriate 
communications skills, and disseminating simplified information improved 
knowledge levels and infant feeding practices among illiterate, rural mothers 
(21). 

The Indonesia Weaning Project was designed to develop low-cost, nutri- 
tionally-sound, sustainable solutions to reduce weaning problems. In addition 
to radio, posters, and recipe leaflets, community leaders and health workers 
channeled nutrition education to mothers. Evaluation using control and case 
groups showed that knowledge of weaning methods, nutritionally-sound feed- 
ing practice, and child growth increased most among communities that also 
received face-to-face communication from health workers (47, 48), thus 
showing the importance of a marketing approach, rather than a media-based 
advertising campaign. 

The Dadah/drug prevention program (PEMADAM) in Malaysia is ex- 
ceptional, because it markets social policies. The comprehensive campaign 
combines marketing principles and other strategies, such as community and 
national-level involvement, in a broad approach to drug prevention education 
that aims to make drug abuse socially unacceptable. PEMADAM is attempt- 
ing to instill societal principles through social marketing, aimed at linking an 
understanding of human behavior with effective social planning at a time 
when social issues are critical (65, 71). 


SERVICE UTILIZATION In Colombia, the drive for universal child
immunization combined communication and marketing strategies, mobilization
of political will and support from various sectors of society, and deployment
of volunteers at a grass-roots level. Local leaders and health promoters were 
influential in disseminating information in the community through home 
visits. The strategy of bringing demand in contact with the service, which 
they called channeling, helped increase immunization coverage from 20% in 
1979 to 60% among children under one and 80% under four in 1984 (23). The 
experience prompted UNICEF to institute a broad approach to its immuniza- 
tion programs in other parts of the world. 

In the Philippines immunization project, mass media motivated mothers to 
bring their children to the clinic for face-to-face education. This strategy is 
progressing towards the goal of 85% immunization coverage by 1993. In a 
1990 survey sample, computed coverage among 12- to 23-month-old children
was 64%. Although the coverage effect has been moderate, a substantial 
effect in timeliness of coverage has been observed. The percentage of children 
who completed the entire series of vaccinations before their first birthday 
increased from 32.2% to 56.2% within one year. In addition, a significant 
improvement in client knowledge, especially concerning the logistics of 
vaccination, was noted. Mobilized national support has been responsible for 
much of the success to date (6). 

In these illustrative cases, social marketing has been effective in increasing 
acceptance of tangible products, such as the condom and the ORS packet. To
change health practices, however, social marketing needs to be part of a
broader strategy that includes linkages with service delivery, skills learning, 
and community education. If the goal is sustained behavior change, and if the 
change has structural implications, social marketing per se has less impact. 


Views from Practitioners 


For this review, we contacted 15 practitioners for a modified Delphi inquiry. 
The five who responded corroborated the findings in the example projects of 
tangible products. Family planning projects have found social marketing 
particularly effective in getting their products accepted. Contraceptive social 
marketing programs are providing protection to over 8 million couples in the 
developing countries, which represents 1.5-2 million births avoided annually,
or about a 2% decrement in annual world population growth (P. D. Harvey, 
Population Services International). 

To measure impact, the quantity of each contraceptive sold is converted
into couple years of protection (CYPs). A survey of 63 family planning
projects, which marketed contraception and sterilization in ten developing
countries, found that the cost was $2-6 per CYP, significantly lower than
that of other methods of delivering family planning in the countries
studied (22). However, measures other than cost, such as surveys of the
population before and after a campaign, should also be used to evaluate
impact, because distribution or sales of contraceptives do not always mean
that the devices are effectively used (J. Rimon, Population Communication
Services).
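
The CYP conversion described above can be illustrated with a minimal sketch. The conversion factors, sales quantities, and program cost below are hypothetical values chosen only to show the arithmetic; they are not drawn from the survey cited (22) or from any project in this review, and the function names are likewise illustrative.

```python
# Hypothetical conversion factors: units of each method assumed to provide
# one couple-year of protection (CYP). Illustrative only, not from the review.
UNITS_PER_CYP = {"condom": 100.0, "oral_cycle": 13.0}


def couple_years_of_protection(units_sold):
    """Convert quantities sold per method into total CYPs."""
    return sum(qty / UNITS_PER_CYP[method] for method, qty in units_sold.items())


def cost_per_cyp(total_cost, units_sold):
    """Program cost divided by the CYPs attributed to the products sold."""
    return total_cost / couple_years_of_protection(units_sold)


# Hypothetical one-year sales and program cost (in dollars).
sales = {"condom": 500_000, "oral_cycle": 26_000}
cyps = couple_years_of_protection(sales)
cost = cost_per_cyp(28_000.0, sales)
print(f"{cyps:.0f} CYPs at ${cost:.2f} per CYP")
```

Because the measure is built from sales alone, it says nothing about whether the products were used correctly, which is the limitation raised by J. Rimon above.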

As a result of its effectiveness in marketing tangible products, some 
practitioners now plan to use social marketing to promote other products, such 
as Vitamin A supplements against xerophthalmia, antimalarial drugs, and 
prophylaxis and treatment for sexually transmitted diseases (P. D. Harvey). 

Some practitioners raised concern over the high cost of mounting a social 
marketing project. If a large proportion of budgets is spent on advertising and 
packaging, mostly at full price, then social marketing projects are hardly 
sustainable (J. Rimon). In such cases, dependence upon external subsidies 
and technical assistance must continue. 

Practitioners also expressed concerns about cost and accessibility. One 
prominent practitioner argued that social marketing family planning projects 
provide services that are not patronizing and do not undermine the dignity of 
recipients, because the products are purchased through an essentially neutral 
market system in which virtually all groups participate (P. D. Harvey). 
However, even subsidized products, such as the ORS packet, can come close 
to a day’s wage in many countries. If products that must be purchased are the 
sole focus of a social marketing project, certain segments of the population, 
usually the poorest, will be excluded. Thus, numerous approaches are needed
to achieve coverage, including health education, communication, training,
and social marketing with differential pricing targeted to various population
segments (56).

Several practitioners urged stricter professional standards, such as greater 
rigor in segmenting audiences and tailoring messages for more impact on 
behavior. These standards would require an accommodation between the 
marketing perspective, which targets segments most likely to change, and the 
public health/epidemiological perspective, which is typically concerned with 
the poorest, highest risk, and least accessible populations (M. Rasmuson, 
Academy for Educational Development). 

Interpersonal communication strategies are important. The Stanford Three- 
Community Heart Disease Study and the subsequent Five City Project re- 
ported that quality media campaigns can inform, motivate, and produce 
changes, but face-to-face communication is needed for skill-building, 
monitoring, and feedback (14, 37). Though often cited as a project that 
included marketing, the Three-Community Study did not consciously employ 
marketing strategies at the time (N. Maccoby, Stanford Center for Research in 
Disease Prevention). 

When social marketing first appeared, enthusiastic supporters thought it 
might solve many public health problems. However, practitioners, while 
arguing for its effective use, have been cautious about its impact and aware of 
requisite conditions. More rigorous analysis and objective evaluation of social
marketing projects would clearly help validate the effectiveness and
cost-effectiveness of the practice.


STRENGTHS AND WEAKNESSES 
Strengths of the Social Marketing Approach in Public Health 


KNOWING THE AUDIENCE Social marketing has had a beneficial impact on 
how the public health sector educates the public and persuades communities 
and individuals to adopt healthy practices. With its emphasis on clients, social 
marketing has sharpened the focus on the public. It has brought more preci- 
sion to audience analysis and segmentation. In addition to demographics, 
psychographic data (attitudes, preferences, personality traits) and social struc- 
ture data (church, worksite, family) are increasingly seen as vital in designing 
projects. These data provide critical information for the formulation of better 
targeted and more effective messages, thus leading to more appropriate 
message design, more effective delivery, and, above all, better reception by 
the public, the ultimate beneficiaries of public health measures. 


SYSTEMATIC USE OF QUALITATIVE METHODS Marketers are diligent users 
of focus groups and other qualitative research methods, which add insight to 
the quantitative information gathered by such instruments as questionnaires. 
Health educators have long used group discussion primarily to resolve
community issues. But their more recent use of focus groups to obtain customers’
views of their campaigns and products and to pretest messages reveals the 
positive influence of marketing. 


USE OF INCENTIVES Social marketers make deliberate and systematic use 
of incentives and special promotion efforts, such as contests and com- 
petitions, which use rewards to draw clients to the market place. This method 
was not a regular feature of the motivational efforts of public health projects 
in the past. Purists might consider any offer of reward a kind of bribery, but 
the competition for attention in the midst of the exploding commercial clutter 
has made it an acceptable practice. 


CLOSER MONITORING Most public health projects pay insufficient attention 
to monitoring and often neglect management. Social marketers are committed 
to close tracking of progress, an important management principle. 


STRATEGIC USE OF MASS MEDIA Social marketers’ use of mass media in
delivering messages to specific audiences to create awareness or foster and
reinforce certain health practices contrasts sharply with the media outreach of
the majority of public health projects. Marketing projects, which usually
include intensive and prolonged use of broadcast media, purchase air time
slots specifically aimed at targeted audiences, whereas underfunded public 
health projects often depend on the largess of the media for free air time. In 
the latter situation, it is the media program directors who, as an obligation to a 
good cause, decide which PSAs to air and when. When PSAs are broadcast 
during slack hours once or twice a month, they can hardly be expected to have 
the same impact as a systematic, well-targeted media campaign. 


REALISTIC EXPECTATIONS Although risk-taking is part of the commercial 
world, entrepreneurs do not take on impossible odds and would refuse any 
hopeless venture. Social marketers follow that tradition. In public health, 
however, officials are too often asked to undertake a $10,000/5-person job 
with $500 and one person. Such doomed projects erode credibility, which, in 
turn, hurts public health’s standing in its competition with other development 
priorities. Social marketing cannot help but improve the chances of public 
health programs through more realistic estimations of the requirements for 
success. 


ASPIRING TO HIGH STANDARDS Just as important, social marketing, with
its roots in the commercial world, often aspires to attain the best information 
materials and talent. This has alerted many public health professionals who 
have all too often been compelled to accept second rate work as a result of 
perennial budgetary constraints. 


RECOGNITION OF PRICE Operating from the conceptual framework of the 4 
Ps, marketers accept that there is a price for any new product or behavior even 
in a voluntary exchange, although not necessarily in monetary terms. Public 
health professionals have only recently accepted that cost comes in many 
forms, such as inconvenience, opportunity costs, and incongruence with local 
culture. The notion “if it is good for you, you must want it” still lingers in the
health field, but social marketers do not make such an assumption. In fact, 
marketers ask, “How can we make people want it?” 


Weaknesses and Negative Aspects of Social Marketing in 
Public Health 


TIME, MONEY, AND HUMAN REQUIREMENTS Marketing practices require 
a heavy investment of time, money, and human resources that many public 
health agencies cannot afford. However well designed a project may be, 
without proper financing and staff, it will not succeed. A special event to 
generate support and promote a health practice requires careful preparation 
and implementation; it cannot be handled by volunteers alone.

Social marketing will continue to run into bureaucratic obstacles, such as
unrealistic time frames, inadequate funding, and understaffing. Because gov- 
ernments are principal players in public health, especially in the developing 
countries, many of these bureaucratic constraints will not go away. Social 
marketing practitioners should develop innovative ways to overcome these 
obstacles and adapt themselves to the realities of development and the con- 
straints of the bureaucratic environment. Otherwise, social marketing may 
face a gradual diminution of its role in public health. 


MARKETING ELEMENTS MISSING Marketing is part of a commercial enter- 
prise with many elements, some of which are missing in the public health 
arena. Checking the requirements of commercial marketing against the reali- 
ties of social marketing for public health programs is a good way to identify 
inherent problems. The commercial equation typically includes research for 
new products; market surveys of public interest in potential new products; 
manufacture of products with quality control; the dynamic price-product-need 
triangle and the interaction with wholesale and retail networks to get products 
distributed and made accessible; commissions and/or bonuses to motivate
the sales force; dismissal of incompetents; bankruptcies for mismanagement;
dividends for shareholders; and government regulatory oversight. Any one of
these elements affects the others, as each serves as a check and balance for the 
entire enterprise. Too often, several of these elements are missing in public 
health initiatives. 

Perhaps the four most intractable obstacles to the success of social market- 
ing in public health are aspects of the 4 Ps (52). Public health does not have 
the flexibility to adjust products and services to clients’ interests and prefer- 
ences. Commercial companies often drop a product line when products prove 
unpopular. It is more difficult to discontinue a needed public health service. 
In social marketing, price, or the clients’ assessment of the cost of the service 
or product, may include such factors as travel time, effort expended, physical 
discomfort, and the social consequences of innovative behavior, which may 
transgress taboos, norms, or the client’s perception of his or her ability to 
change. For example, the cost in terms of effort and inconvenience for rural 
women to take their children to be immunized is the enemy of many im- 
munization programs. Although a network of retail points at convenient 
locations is a sine qua non for any successful commercial marketing effort, 
there is a limit on the number of places at which public health products are 
available. Behavior change through social marketing requires the commit- 
ment to a sustained promotional effort. However, few public health projects 
have the resources to support prolonged promotion activities. 


THE DEATH OF PSAs AND OTHER FREE SERVICES? Social marketers’ 
practice of buying air time may have a serious negative impact on the future of
PSAs. The health sector has depended on broadcasting services to give free
air time to PSAs. In many countries, once the broadcast media have been paid 
to air public health spots, they are no longer willing to give free air time to 
health (39). The same could be true for the print media. Paying for time and 
space creates a serious problem for the tradition of free promotion for public 
health. Efforts to influence public service air time policies may be a good 
starting place to tackle this issue. 


A BROADER APPROACH 
Allied Practices 


Because social marketing is not the only practice in the field of social change, 
it is useful to touch upon the allied practices of development communication, 
health education and promotion, and public relations. Although they have 
different starting points, and each has developed a theoretical framework for 
its methods of work, they all encourage people to change attitudes and 
behavior and facilitate the adoption of new behavior. They all tend to have an 
eclectic approach and have benefited variously from psychology, anthropolo- 
gy, and sociology. Indeed, each of these practices is quick to incorporate that 
which it perceives to be of value. 

Development communication specialists are concerned with interpersonal, 
group, and mediated communication. Many of them come from a background 
of mass communication; others have their roots in interpersonal communica- 
tion. Both groups stress the importance of the two-way dialogue, especially 
when working with communities, which emphasizes the critical importance of 
meeting people’s felt needs. Development communication strategies now 
include education and social marketing elements. 

For decades, health education has championed the principle of community 
involvement. Health educators are expected to put the interest of the commu- 
nity first in designing any project. They consider communication a skill and 
marketing a tool. Health educators also emphasize understanding the various 
determinants of health behavior. Health education students are now required 
to take communication and social marketing courses as part of their training. 
As an example of this dovetailing of disciplines, the World Health Organiza- 
tion’s (WHO) expert committee on new approaches to health education in 
primary health care urged health education practitioners in 1982 to adopt a 
people-oriented approach. The committee also called for strengthening the 
communication skills of health education specialists (70). 

Public relations began as a way to improve public perception for in- 
stitutions and individuals. Many of its early practitioners came from journal- 
ism. Through evolution, it now encompasses media outreach, special events, 
in-house communication, and community education. Many universities that 
grant public relations degrees now offer courses in communication,
advertising, journalism, and marketing. Public relations specialists not only project
their institution’s or client’s views to the public, but also reflect the public’s 
interest and perception in their feedback to employers and help devise policies 
and strategies beneficial to both their employers and the public. Both
health education specialists and public relations practitioners in the US have
recently introduced accreditation programs to ensure professional and ethical
standards among their ranks.

Social marketing and its allied practices all claim that their respective 
approaches are comprehensive. They most certainly have overlapping claims 
and methodologies. All of them require their practitioners to analyze au- 
diences, design tailored messages to suit specific segments, and pretest 
approaches and materials. They all work with mass media, stress operations 
research and data collection, facilitate behavior change, practice empathy, 
orient themselves closely to their audiences, and recognize the principles of 
involvement and empowerment (40). 


A Prospective Look 


With the advent of the lifestyle illnesses, social marketing, which depends 
heavily on media, is likely to play a bigger role in public health. Lifestyle 
illnesses, such as cancer, heart diseases, psychosocial disorders, malnutrition 
and overnutrition, accidents, and sexually transmitted diseases, are, in fact, 
transmittable by the impact of words and images on lifestyle. Similarly, 
words and images are needed to combat them. With the explosion of human 
interaction and communication, abetted by more than 400 million annual 
travelers in recent years, these diseases need to be approached not as 
noncommunicable diseases, as most are currently classified, but as new 
“communicable” diseases. With its disciplined approach to mass media work, 
social marketing can and should play a useful role in combating these new 
communicable diseases (39). 

Since social marketing carved out its niche in public health in the 1970s and 
1980s, many health professionals and development specialists have realized 
that social change is a complex and challenging process. Health behavior 
cannot be separated from such issues as policy; economic and social circum- 
stances; personal attitudes; political and religious allegiances; societal norms; 
and the entrenched interests of businesses, institutions, and certain pro- 
fessional groups. Increasingly, health and development specialists are 
advocating a broader look at these problems and tackling them in a more 
comprehensive way. At the international level, WHO and UNICEF now
support a broader approach to change. 


WHO’S HEALTH PROMOTION The World Health Organization recently 
called for action in health promotion, a broader version of health education,
which includes advocacy for health supportive laws and public policies,
intersectoral solidarity, alliances with various social institutions, partnership 
with mass media, and grass-roots education strategies to empower people for 
health action. Dr. Hiroshi Nakajima, Director General of WHO, has said: 
“ . . . Health is a product of social action . . . Active community participation 
and supportive social policies are necessary for progress” (69). The evolving 
WHO concept encompasses lifestyles and other social, economic, environ- 
mental, and personal factors conducive to health. 


UNICEF’S SOCIAL MOBILIZATION In launching its Child Survival and
Development initiative in 1983, UNICEF found it necessary to mobilize
various societal sectors for several inexpensive interventions to save
millions of lives. Social mobilization (SOCMOB), as this multisectoral effort 
is called, is a process that seeks to facilitate and enhance the approach to 
development issues that aims at “going to scale,” from a micro level up to 
national scale. 

Social mobilization enables national governments and development assis- 
tance agencies to move beyond the project phase of many development 
programs. It first aims to create the political will for constructive change and
then to translate that will into the establishment of viable social service 
policies and actions to meet basic needs. 

A continuum of mutually reinforcing, well-researched, carefully targeted, 
rigorously implemented activities is required for the mobilization process. 
The umbrella of SOCMOB covers advocacy, marketing, media, training, 
community education, and grass-roots organization activities. 

Often, these activities are undertaken by various groups, without a broad 
strategy that considers the critical linkages between and among them. They 
often wind up as isolated, sometimes spectacular, efforts that fizzle out like 
fireworks (42). The SOCMOB approach aims at avoiding this fireworks 
syndrome. 

Because many development objectives involve far-reaching changes, SOCMOB
is a promising strategy for specific health programs, as well as more
global issues that affect development generally. Where needed, SOCMOB 
can be used to generate the critical political will that is essential for de- 
velopment; it also aims at the involvement of individuals at the community 
level in adopting positive behaviors. 

There is a place for marketing in both these approaches, as they stress the 
need to understand people and tailor inputs to the specific requirement of the 
communities concerned. The elements of marketing considered most critical 
for promoting healthy behavior include consumer or market research, product 
or service quality, a distribution network, product or brand image, price and 
consumer affordability, accessibility, consumer satisfaction, and promotion
(H. S. Dhillon, WHO). Mass media, which have a special place in marketing,
are partners, not merely channels for health messages (41). 

Toward the beginning of the 1980s, Fox & Kotler (15) predicted that within 
the decade, marketing would be a regular feature of a growing number of 
nonprofit organizations. This is certainly the case, as far as the health sector is 
concerned. Social marketing, which helps stimulate demand and fine-tune the 
design and delivery of health messages and services, has a secure place in 
public health. The new thrusts of UNICEF and WHO, two of the key 
development organizations at the global level, are likely to confirm this in the 
years to come. 

Nevertheless, social marketing cannot solve public health problems on its 
own. Within the ranks of marketers, there is an active push for integrated 
marketing communication, which includes communication and education 
approaches. Social marketing, too, may be moving toward the broader 
approach. Not long ago, frustrated social marketers complained that health 
was simply not part of the marketing domain. It may still be so, but marketing 
is fast becoming part of the health domain.


INTRODUCTION


In 1974, the Congressional Committee on Interstate and Foreign Commerce 
held hearings on unnecessary surgery. McCarthy et al (35) presented the most 
important evidence. Their findings from the first surgical second opinion 
program (SSOP) indicated that 17.6% of recommendations for surgery were 
not confirmed. The Congressional Subcommittee on Oversight and In- 
vestigations extrapolated these figures to estimate that nationwide there were 
2.4 million unnecessary operations performed annually, resulting in a cost of 
$3.9 billion and 11,900 deaths (47). 

This claim fell on the receptive ears of payers, including the Health Care 
Financing Administration (HCFA), who were beginning to feel the burden of 
accelerating increases in health care costs. Reducing costs by 15-20% was an
appealing prospect, and several payers subsequently instituted mandatory 
SSOP. About the same time, HCFA and commercial insurance companies 
implemented preprocedural review programs for operations widely consid- 
ered overutilized. 

It is worth recalling that before 1970 public policy was concerned not with 
overuse, but with the problems of underuse of health services and perceived 
shortages of doctors and hospitals (16). But, as the full costs of Medicare 
became evident, cost containment entered the public agenda. Increasing cost 
pressures in the private sector, which resulted from technologic advances, 
also led private payers to search for ways to reduce health care expenditures.
Unnecessary surgery was an obvious target. Doctors have had less interest in
unnecessary surgery, because they have tended to dissociate themselves from 
the debates concerning cost containment and because most do not recognize 
unnecessary surgery as a significant problem. No surgeon believes that what 
he or she does is unnecessary, and physicians are generally reluctant to pass 
judgment on their colleagues. 

The 1970s witnessed a remarkable profusion of mechanisms designed to 
change, or at least challenge, physicians’ decisions. In addition to SSOP and 
precertification programs, analysis of geographic variations of use of pro- 
cedures was provided to physicians as “feedback” with the hope of influenc- 
ing their patterns of use. “Managed care” was invented: an overt second- 
guessing of doctors’ decisions aimed at reducing days of hospitalization and 
use of expensive services. The government provided subsidies for the de- 
velopment of health maintenance organizations (HMO) and began to encour- 
age Medicare patients to enroll in them. 

Although significant reductions have subsequently been reported in hospi- 
tal stay or in the use of a particular procedure, it has been difficult to 
demonstrate that these programs have had the significant effect on utilization
and costs that was anticipated. In fact, health care costs have continued to
rise at two or more times the inflation rate. More importantly, there is no
evidence that these attempts to reduce utilization have had any effect on the 
quality of care. Recent reports suggest high (10-20%) rates of inappropriate 
use of a variety of services (3, 9, 20, 29, 37, 58, 59). Clearly, unnecessary 
surgery is still with us. 

Surgery has been a primary focus of attention for those interested in 
overuse of health care services for several reasons. First, it is easy to study: 
Most operations are reasonably standardized, outcomes are obvious, and the 
delivery of the service can be reliably ascertained from discharge or payment 
claims data. Second, operations are costly, in terms of both surgical fees and 
hospitalizations. More can be saved by reducing rates of surgery than by 
curtailing the use of most other therapeutic or diagnostic services. Third, 
surgical care is generally riskier than other forms of therapy. There is a finite 
mortality risk associated with almost every major operation. The combination 
of risk and potential for dramatic cure gives surgery an aura of excitement 
lacking in other forms of therapy. Abuse is, therefore, more serious and more 
intriguing. 

Finally, the current interest in unnecessary surgery also reflects recent 
increases in the number and types of operations performed in the United 
States. Many operations that are now performed frequently, such as coronary 
artery bypass, hip replacements, carotid endarterectomy, arthroscopy, laparoscopy,
and heart and liver transplantation, were unknown just 25 years ago.
Not surprisingly, some of the indications for these procedures are controversial.
There has also been a substantial increase in the rates of performance
of well established operations, such as cataract extraction and cesarean
section, which raises questions of overuse (3). Overall, the surgery rate grew 
at twice the rate of the population from 1979 to 1986 (27). Unnecessary 
surgery is, in many ways, a “disease of medical progress,” which reflects the 
hazards, as well as the benefits, of technological advances. A concern about 
its impact is timely, because the implications of unnecessary surgery are 
greater than ever, in terms of both the number of patients at risk and the 
aggregate cost. 

Recently, there has been a deliberate shift in federal emphasis. Responding 
to imprecations from the research community, Congress established the 
Agency for Health Care Policy and Research (AHCPR) in 1989. This new 
agency within the Public Health Service supervises and funds research into 
clinical outcomes and the development of practice guidelines, and it has 
launched a major effort in both areas. In addition, HCFA has pursued the 
development and implementation of analysis of large clinical data bases to 
demonstrate patterns of use that can be fed back to physicians or used by the 
Peer Review Organizations (PRO) as quality measures. 

Physicians have also become concerned about unnecessary surgery, as their 
colleagues have produced more convincing and sophisticated evidence of 
inappropriate use of some operations and procedures (8, 59). Professional 
specialty societies, individually and through the Council of Medical Specialty 
Societies and the American Medical Association, have started to develop 
practice guidelines to help physicians choose appropriate care (50). 

Why does unnecessary surgery occur? Why haven’t the above-mentioned 
programs worked? More importantly, will outcomes research, data base 
analysis, and practice guidelines get us where we want to go? 

In this review, I first consider how best to define the term unnecessary 
surgery and then summarize the evidence for its presence. Next, I examine the 
theories regarding the occurrence of unnecessary surgery, which leads logi- 
cally to a consideration of methods that have been recommended for reducing 
it. Finally, I consider the policy implications of these recommendations. 


WHAT IS UNNECESSARY SURGERY? 


The term “unnecessary surgery” has many meanings. The person who has had 
an operation that failed to relieve his symptoms may understandably conclude 
that the operation was unnecessary, even if the operation is successful in most 
individuals and its use is unquestioned. Others confuse unnecessary with 
“elective,” a term used by doctors in reference to timing, not as a synonym for 
“optional.” In contrast to an “urgent” or “emergent” operation, an elective 
operation is one that can be scheduled at a time of convenience, because the 
underlying condition does not pose an immediate threat to life or health.

The most common association of the term unnecessary surgery is with high
frequency of use. Some cesarean sections are considered unnecessary when 
the rate of performance in a region exceeds some threshold number. One 
learns that there are “too many” hysterectomies performed, or that Dr. X 
performs unnecessary surgery because he does a higher number of a certain 
operation than his colleagues. Some carry this type of thinking to the extreme: 
Lower rates of surgery of all types must be evidence of higher quality medical 
care! This stands in interesting contrast to the use of vaccines, for example, 
for which most people associate higher use with better quality care. The 
difference is that vaccines are considered an unequivocal low-risk good, 
whereas operations carry greater risk and have been suspected of overuse. 

Although it might seem simplest to consider as unnecessary any operation 
that is not clearly necessary, this definition creates more problems than it 
solves. Webster defines necessary as something that “must be by reason of the 
nature of things,” “cannot be otherwise,” or “determined and fixed and 
inevitable” (53). No operation qualifies, for the simple reason that no opera- 
tion “must be” or is “inevitable” for any patient. A host of variables enter into 
the decision for surgery, of which the patient’s own values are among the 
most important. Individuals vary in their tolerance of risk, fear of surgery, 
desired activity level, tolerance of pain, and fear of death. They also vary in 
how they value different probabilities of good and bad outcomes. Clearly, it is 
not possible to define what is necessary for an individual—what is necessary 
for me may be totally unacceptable to you. 

In contrast, Webster’s definition of unnecessary, “useless,” is easy to use, 
as it can be based entirely on objective data. No operation is necessary if it is 
ineffective, i.e. if it does not accomplish its objective for a given clinical 
situation.¹ For example, if the objectives of coronary artery bypass graft
(CABG) surgery are to relieve pain and prolong life, CABG is ineffective— 
and, therefore, unnecessary—for an asymptomatic patient with coronary 
artery disease that causes blockage of only one of the three coronary arteries, 
because studies have shown that CABG does not increase longevity in 
patients with single vessel disease. An unnecessary operation, then, is one 
that is ineffective or useless. An operation is also unnecessary if it confers no 
clear advantage over a less risky alternative. In both instances, the operation 
does not represent a net benefit to the patient. The patient will not be better 
off. This is the definition we will use. 


¹Rarely is an operation totally ineffective. Internal mammary ligation for the treatment of
angina pectoris and glomectomy for asthma are examples. These operations were ultimately 
discredited by randomized trials. More commonly, an operation is effective for its initial use, but 
as experience is gained, the indications are broadened to conditions for which it is useless.

One other aspect of ineffectiveness must be considered: occult or unrecognizable
unnecessary surgery. For many indications for operations, the
evidence for effectiveness is clear-cut. But, there are other indications for 
some operations for which the evidence is absent or equivocal, and expert 
judgments are divided on its benefit. Some indications in this “gray zone” of 
effectiveness will one day be found to be inappropriate. Therefore, these 
operations represent occult unnecessary surgery, unrecognizable at the 
present state of knowledge. Operations performed for indications in this 
uncertain or equivocal category cannot now be fairly labeled as unnecessary, 
but some of them eventually will be. Other operations currently labeled as 
effective will be found to have only marginal benefit as more data are 
accumulated from outcomes studies. These, too, represent occult unnecessary 
surgery. 

Clearly, we cannot measure what is not recognizable, so it would be highly 
speculative to estimate the extent of occult unnecessary surgery. However, 
the implication is clear that the full extent of unnecessary surgery is greater 
than is measured by any of the current methods. For now, we must confine 
our analysis to what is known, i.e. to clinical situations in which the best 
available evidence or informed expert judgment indicates that an operation is 
ineffective or useless. What do we know? 


THE EVIDENCE 


Evidence for unnecessary surgery comes from three types of studies: circum- 
stantial evidence from studies of variations of rates of use of various op- 
erations among different geographic regions and between different types of 
practices, denial rates of second opinion and precertification programs, and 
attempts to measure inappropriate use directly. 


Geographic Variations 


It is not unreasonable to assume that if an operation is being performed ten 
times as frequently in one area as in another, the high use must represent 
unnecessary surgery, although an equally logical conclusion is that the low 
rate represents underuse. There is probably some truth in both interpretations. 

In 1969, Lewis (31) published the results of a study of variations in rates of 
use of six common surgical procedures by Blue Cross enrollees in 11 health 
planning regions in Kansas. He found that rates varied by as much as 3.8 
times and attributed the higher rates to overutilization. Wennberg & Gittel- 
sohn subsequently found similar variations in Vermont (57) and Maine (56) 
and they also attributed the higher rates to overuse. Significant (as high as 
tenfold) variations have been noted for some operations, and regional variations
in use have been found throughout the US, as well as in Canada (43,
51), the United Kingdom, and Norway (36). Although most studies have
compared small areas (counties or hospital-service regions), significant varia- 
tions have also been observed between large areas (states or parts of states) 
(8). 

Many studies have been performed to identify the factors that lead to 
geographic variations and to determine if variations do, in fact, represent 
unnecessary surgery. The simplest and most obvious explanation for differ- 
ences in rates of surgery would be differences in the incidence of disease. 
Curiously, this aspect of geographic variations has not been extensively 
investigated, perhaps because the magnitude of the variations in use of 
operations far exceeds any likely differences in underlying disease rates. In 
addition, most studies have compared adjacent small regions, in which similar 
populations and environments would be expected to result in similar rates of 
disease. In one of the few specific attempts to correlate surgical rates with 
disease incidence, Roos et al (45) found no relationship between rates of 
respiratory infection and rates of tonsillectomy. 

Geographic variations could reflect differences in supply or demand. De- 
mographic predictors of the demand for surgical services have been ex- 
tensively studied. Surgical utilization is higher in older patients, women, 
those with higher incomes, and noncollege graduates (4, 22, 44). However, 
none of these factors explains more than a small fraction of geographic 
variations (22, 44, 56). Physicians and their spouses have higher rates of 
surgical treatment (6), but these differences also have not been related to 
geographic variations. 

The evidence on the effect of supply on geographic variations in use is conflicting. Health
maintenance organizations and other managed care plans reduce the use of 
surgical care (33), but HMO participation rates have not been linked to 
geographic variations. Lewis (31), Wennberg & Gittelsohn (57), and Stock- 
well & Vayda (46) found that the number of available hospital beds was 
strongly related to geographic variations, but Roos (42) found no relationship 
between hysterectomy rates and bed availability. The number of surgeons has 
been correlated with regional surgical rates (1, 5, 31, 57), again with some 
exceptions (45). 

Variations resulting from differences in demand and supply do not neces- 
sarily represent evidence of unnecessary surgery. These variations could 
reflect differences in the ability of patients to access useful health care. 
Certainly, Bunker’s (6) finding that physicians and their spouses have higher 
surgical rates than others supports that possibility. 

Many investigators consider that the practice style of physicians is the most 
important determinant of regional variations. Wennberg has noted remarkable 
consistency from year to year in regional use rates and has referred to these 
patterns of high or low use as “surgical signatures.” He considers these local
practice styles the most important variable that explains variations (55).
Individual surgeons’ caseloads can have a significant effect on overall region- 
al rates, especially in small areas where a few high-volume surgeons can 
markedly alter the local rate (30, 45). 

Why do surgeons’ practice patterns vary so much? Wennberg (54) and 
others claim that variations in practice style reflect the degree of uncertainty in 
surgical decision-making. Eddy (13) has noted that as a result of the rapid 
pace of biomedical advances, the degree of uncertainty about the effective- 
ness of many forms of therapy is greater than ever. Uncertainty leads to 
variations in how physicians perceive the value of a procedure and, hence, to 
variations in its use. 

If the practice style hypothesis is correct, then geographic variations in the 
use of operations will be greatest when the level of uncertainty as to their 
value is high and least when it is low. That is what has been found. In all 
geographic variation studies, surgery for inguinal hernia and fractured hip
shows the least variation. There is little disagreement about the indications
for surgical treatment for these conditions. Controversial operations, such as 
carotid endarterectomy and laminectomy (disc operations), have the greatest 
variations (8, 25, 59). It is also of interest that when physicians were informed 
that their individual rates were substantially above state averages they reduced 
their rates, which suggests that they recognized some overuse (54). 

Two studies have attempted to measure the contribution of unnecessary 
surgery to regional variations in use rates directly. Roos et al (45) studied the 
relationship between tonsillectomy rates and adherence to standards for in- 
dications. Although they found high rates of inappropriate use, there was no 
correlation with rates of tonsillectomy. The RAND Health Services Utiliza- 
tion Study examined the relationship between the appropriateness of in- 
dications, as determined by an expert panel, and utilization of three pro- 
cedures that showed substantial variations in use between large regions 
(states). For the one operation studied, carotid endarterectomy, they found a 
high rate of inappropriate use (32%), but no difference in the rate of in- 
appropriate use between high and low use areas (9). Analysis of the data for 
one state also showed no differences in rates of inappropriate use among small 
areas (counties) (29). 

In summary, although geographic variations in the use of surgical pro- 
cedures result from a multitude of factors, the most important seems to be 
variations in physician perceptions of the value of the operation in question. 
These differences result from lack of professional consensus about the value 
of many procedures for many potential indications. Thus, geographic varia- 
tions are primarily a measure of professional uncertainty. In the few instances 
in which unnecessary surgery has been directly measured, the extent is 
greatest for procedures about which there is little professional consensus.
Thus, large geographic variations in the use of an operation suggest that a
significant fraction is unnecessary. The extent of variation is not a direct 
measure of the extent of unnecessary surgery, however, as inappropriate use 
represents only a minor share of the differences. 


Variations by Method of Payment 


Another type of variation in practice patterns that has been cited as evidence 
of inappropriate care is the difference in surgical rates between HMOs and 
private practice. Patients who receive care in HMOs are one-half to one- 
fourth as likely to be operated on as patients in the fee-for-service sector (132, 
49). Overall, reductions in use of medical services in HMOs do not have a
deleterious effect on outcomes (10), so it can be inferred that the difference in 
surgical rates is at least partly due to unnecessary surgery. 


Second Surgical Opinion Programs 


Although the specific purpose of SSOP is to reduce the rate of unnecessary 
surgery, 17 years after their introduction we still do not know whether they 
do. McCarthy, who introduced the first SSOP in 1974, did not claim that it 
identified unnecessary surgery, but carefully stated that its purpose was “to 
help the patient make a more informed decision” (34). Others were less 
restrained. The Congressional Subcommittee on Oversight and In- 
vestigations, which as mentioned above equated McCarthy’s 17.6% 
nonconfirmation rate in his mandatory SSOP with unnecessary surgery and 
thus estimated that 2.4 million unnecessary operations were performed an- 
nually, later strongly recommended the use of a second opinion program in 
the Medicare and Medicaid programs. The Department of Health and Human 
Services promptly instituted a national voluntary second opinion program for 
Medicare, and seven states introduced mandatory second opinion programs in 
their Medicaid programs. Private health insurers began to offer second opin- 
ion programs, and by 1984, 28% had mandatory SSOP, including 60 Blue 
Cross/Blue Shield plans. Employers also instituted mandatory second opinion 
programs; by 1988, a survey of 240 major US firms found that 62% reduced 
coverage if a second opinion was not obtained (39). 

It is clear that the major reason for instituting second opinion programs has 
been to control costs by reducing rates of surgery. In the absence of controlled 
studies, it is not possible to conclude whether nonconfirmation rates or 
nonoperative rates measure unnecessary surgery, but overall rates of surgery 
have been reduced by SSOPs—at least initially (28). At Senate hearings in 
1985, the Inspector General reported that Medicaid mandatory programs in 
three states had reduced utilization 20-35% and saved $7.5 million (48). In 
their study of the Massachusetts Medicaid SSOP, Poggio et al (41) estimated
a total reduction in rates of performance of the targeted operations of 24%.
Almost all of this reduction was due to the “sentinel effect”—a decline in 
recommendations for surgery because of the physicians’ knowledge that a 
second opinion would be obtained. The direct effect of the nonconfirmations 
(allowing for those who decide to have the operation anyway) reduced
the surgical rate by only 2%. Few studies have addressed the effect of second
opinion programs on health status, and none have assessed health status of the 
target population before and after the surgical decision. 

Recently, enthusiasm for SSOPs has cooled as more and more programs 
report high confirmation rates (and, therefore, little direct savings). Lacking
plan-specific information about the sentinel effect, many payers apparently do 
not consider its savings to be relevant. Several insurers and large companies 
have given up their programs because they cost more than they save (39). 
Finally, HCFA, which never fully implemented the Congressional mandate 
for SSOPs, has recently withdrawn its proposed regulations for mandatory 
SSOPs for both Medicaid and Medicare, citing controversies over cost- 
effectiveness, concern over creating barriers to care, and its desire to decrease 
the number of mandated programs (39). 

What does the experience with second opinion programs tell us about the 
extent of unnecessary surgery? Unfortunately, less than one might hope. 
Although many studies have been made of nonconfirmation rates and cost 
savings, none have directly addressed the question of whether nonconfirma- 
tion accurately identifies operations that should not be performed. In the 
absence of controls, it is not even possible to tell if the supposed benefits of 
forgone operations are, in fact, realized. The absence of outcome data even 
prevents evaluation of the supposed benefits to SSOP patients of 
nonconfirmation. 

More fundamentally, the characteristics of second opinion programs make 
it unlikely that they either identify or diminish the rate of unnecessary 
surgery. Second opinion programs are a form of unstructured implicit review, 
i.e. the second surgeon makes a judgment based on his own knowledge and 
experience, not according to any explicit or agreed-upon criteria of 
appropriateness. In other clinical studies, unstructured implicit reviews have 
had low reliability and questionable validity. Second, the process does not 
distinguish between differences of opinion, in which evidence is scant and 
experienced surgeons disagree, and differences of fact, in which one of the 
surgeons (either one) has supporting scientific knowledge that the other is 
unaware of. 

Finally, the use of peers (specialists of equal qualification) to provide the 
second opinion ensures that SSOPs do not improve the quality of care, but 
actually make it worse (28). If each surgeon has a similar, independent, 
random error rate, the errors will be cumulative. For example, for 
controversial operations, we can reasonably assume that each surgeon is wrong in 
his or her judgment 10% of the time. Ten percent of the first surgeon’s 
recommendations for surgery will be inappropriate. If the second opinion surgeon 
also has a 10% error rate that is independent of the first’s, he or she will 
approve surgery for 90% of all of those for whom it was recommended. Thus, 
surgery will be approved for 90% (9/10) of those whose first recommendation 
was erroneous. And it will be disapproved for 10% (9/90) of those for whom 
the first recommendation was appropriate, and who would benefit from 
surgery. Of every 100 patients referred, 9 will have an inappropriate 
operation confirmed and 9 will be denied an appropriate one. Thus, 
inappropriate treatment will be advised for 18% of patients, 
instead of 10% if the second opinion had not been obtained. 
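
The arithmetic in this example can be checked with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes, as the example does, that the second surgeon fails to confirm a fixed fraction of recommendations regardless of whether the first recommendation was correct.

```python
from fractions import Fraction


def second_opinion_outcomes(first_error, second_nonconfirm):
    """Return the fractions of all referred patients who receive wrong advice.

    Assumption (from the worked example in the text): the second surgeon's
    nonconfirmation is independent of whether the first recommendation
    was appropriate.
    """
    inappropriate = first_error          # first recommendation was wrong
    appropriate = 1 - first_error        # first recommendation was right
    # Inappropriate operations that the second surgeon nevertheless approves.
    wrongly_confirmed = inappropriate * (1 - second_nonconfirm)
    # Appropriate operations that the second surgeon disapproves.
    wrongly_denied = appropriate * second_nonconfirm
    return wrongly_confirmed, wrongly_denied


error_rate = Fraction(1, 10)  # the 10% error rate assumed in the text
wrongly_confirmed, wrongly_denied = second_opinion_outcomes(error_rate, error_rate)
total_inappropriate = wrongly_confirmed + wrongly_denied

print(f"Inappropriate surgery confirmed: {float(wrongly_confirmed):.0%}")
print(f"Appropriate surgery denied:      {float(wrongly_denied):.0%}")
print(f"Inappropriate advice overall:    {float(total_inappropriate):.0%}")
```

Under these assumptions the two kinds of error each affect 9 of every 100 referred patients, which reproduces the 18% figure, compared with 10% when only the first opinion is used.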

For all of these reasons, it is not possible to draw any conclusions about the 
extent of unnecessary surgery from the experiences of second opinion pro- 
grams. 


Precertification Programs 


Precertification programs are designed to identify potentially unnecessary 
operations before they are performed. Denial rates (the fraction of proposed 
operations for which payment is denied by the carrier) thus may be measures 
of unnecessary surgery, as these operations presumably would have been 
performed if payment had not been denied. For competitive reasons, com- 
mercial carriers do not release information about denial rates. However, 
information is available for public programs, particularly the results from the 
PRO, which performs the quality assurance function for Medicare. For certain 
designated operations, i.e. those that are widely considered overused, the 
PRO requires the surgeon to obtain approval before the patient can be 
admitted to the hospital or have the operation. A two-stage process of review 
is used. Nurses assess the appropriateness of the proposed surgery according 
to whether the patient meets explicit screening criteria for that procedure. If 
the patient does not, the case undergoes implicit review by a PRO physician. 
Screening criteria are developed by the PROs individually and vary widely 
(26). They are often oriented toward establishing the presence of disease, not 
whether the proposed operation is appropriate for the individual patient. 
Consideration may not be given to severity of disease, comorbidity, possible 
alternative treatments, or even outcome probabilities. Thus, the aggregate 
denial rate for PROs nationwide was, not surprisingly, only 2.3% in 1988 and 
1.6% in 1990 (A. Webber 1988, personal communication). 

Recently, several large insurance companies have instituted computerized 
preprocedural review programs by using a commercial product that applies 
highly detailed criteria that have been developed with the RAND/UCLA 
appropriateness methodology. Blue Cross/Blue Shield reported the results of 
a pilot program that used these criteria for 21 procedures in six states over a 
one-year period ending in July 1990. The overall rate of inappropriate 
proposed use was 11.2%; individual rates of inappropriateness varied from 0% 
for CABG to 21.5% for hysterectomy and 27.1% for tonsillectomy (27). 


Direct Measurement 


In 1953, Doyle (11) attempted one of the first direct measurements of 
unnecessary surgery. He reviewed the records of 6248 hysterectomies per- 
formed in Los Angeles and found that 39% were “unjustified.” Because these 
were implicit judgments, his findings are open to the same criticisms leveled 
at second opinion programs: absence of specific criteria and lack of distinction 
between difference of opinion and difference of expertise. In a recent study of 
multiple reviews of 50 cases of cesarean section performed for fetal distress, 
Barrett et al (2) found that five reviewers agreed in their judgments in only 
28% of cases. Nonetheless, 30% of the operations were judged inappropriate 
by four of five reviewers. 

Because of dissatisfaction with implicit reviews, serious students of quality 
assessment have turned to explicit criteria that can be applied in a standard- 
ized fashion. The development and application of these methods has ex- 
panded dramatically in the past few years. Valid judgments require that 
explicit criteria be comprehensive, detailed, and clinically relevant, and that 
they clearly specify the conditions under which an operation is either appro- 
priate or inappropriate. To develop criteria, investigators have used evidence 
in the literature, judgments of experts, or a combination of the two. The 
process may be informal and qualitative or highly structured and semiquan- 
titative (17, 38). The source of information about the patients also varies. 
Although reimbursement claims data have been used by some, the level of 
clinical detail recorded is seldom adequate to permit judgments of 
appropriateness, so most investigators have relied on review of medical 
records. 

Table 1 summarizes the findings from the literature of studies of un- 
necessary surgery by using explicit criteria. These findings are the most 
convincing evidence of unnecessary surgery, which appears to occur in 
8-86% of patients, depending on the procedure studied and the criteria used. 
Although these figures are disturbingly high, it is important to note that they 
are not a fair representation of surgery in general. These operations were 
selected for study precisely because they were controversial, because of either 
substantial geographic variations in use or other evidence of overuse. 

Table 1 Explicit criteria studies 

Year  Operation               Number  % Unnecessary  Source  Type of criteria  Reference 

1977  Tonsillectomy             3072                                           (45) 
1986  Carotid endarterectomy     107       13                                  (39) 
1988  Carotid endarterectomy    1302       32                                  (9) 
1988  CABG                       386       14                                  (58) 
1988  Pacemaker insertion        382                                           (20) 
1990  Hysterectomy               257                                           (18) 
1990  CABG                       320                                           (19) 

      Total                     5826 

Payment claims. 
Patient record review. 
Criteria developed from literature review. 
Criteria developed by structured process of literature analysis and consensus of experts. 
Criteria developed by group of experts. 

In summary, the evidence for unnecessary surgery is largely circumstantial. 
Geographic variations reflect uncertainty and, thus, indicate the presence of 
unnecessary surgery, but do not measure its extent. Second opinion program 
results represent only differences of opinion between two individuals, a thin 
reed upon which to make a judgment of inappropriate use. Criteria studies, 
and the results from the related use of explicit criteria for precertification 
programs, do provide concrete evidence of unnecessary surgery. From these 
studies, it is reasonable to conclude that 10% or more of surgical procedures 
are unnecessary. For controversial operations, the fraction may be 
substantially higher. 


WHY DOES UNNECESSARY SURGERY OCCUR? 


Why do surgeons perform unnecessary surgery? It is difficult to believe that 
many do so deliberately, out of greed or malice. A judgment that a physician 
is performing unnecessary surgery implies that the operation is known to be 
inappropriate for a given condition. But, no one knowingly performs a useless 
operation. Therefore, the surgeon either does not know that it is inappropriate 
or does not accept the evidence. 

Unfortunately, it is often not clear what is “known” in medicine. Contrary 
to popular assumptions, most accepted medical therapy is not based on 
scientific evidence of effectiveness. Acceptable therapy, therefore, includes 
both those treatments for which there is good evidence of effectiveness and 
those for which the evidence is scant, but the weight of professional opinion is 
favorable. Further, science is what scientists say it is, so even the acceptabil- 
ity of scientific data relies on the belief that the conclusions are valid. In the 
absence of a consensus, whether based on evidence or expert opinion, a 
judgment regarding the necessity of a given treatment is impossible. 

Several considerations determine whether a consensus will develop and 
whether an individual physician will know and accept the consensus judgment 
on the appropriate use of any treatment: the methods by which scientific 
knowledge is developed, the manner by which it is disseminated, and a host 
of social and psychological factors. 


Knowledge Development 


The development of information about the usefulness of a new technology 
takes place in stages, starting with the demonstration that a procedure works. 
The inventor typically provides evidence from a clinical trial that shows that 
the procedure is effective for treating a certain condition. Ideally, this demon- 
stration of potential value should lead to a randomized clinical trial, but in 
practice most operations are not evaluated systematically early in their dis- 
semination. More typically, the innovator and others next explore the range of 
applications for the new therapy and discover and report on complications and 
problems associated with its use. If the apparent benefits are substantial and 
the risks are not excessive, widespread use may rapidly follow. Identification 
of conditions for which the new technology is not useful occurs slowly over 
time as experience accumulates. Like other negative findings (7), in- 
appropriate use is often not reported. Rarely, and almost always much later, a 
new treatment may be the subject of a randomized clinical trial. 


Knowledge Dissemination 


The primary sources of scientific information that the physician turns to are 
reports of clinical research in medical journals. This research, usually carried 
out at an academic medical center, establishes the efficacy of a procedure, i.e. 
how it works under controlled experimental conditions, not the effectiveness, 
i.e. how the procedure works in practice. In addition, clinical research is 
usually directed at defining whether an operation works, not at identifying the 
circumstances under which it may not be of value. 

Journal articles have other limitations. Because of their academic origins, 
the studies may reflect environments in which personnel and equipment 
resources are more extensive than in community practice. There is a referral 
bias, in that patients referred to academic medical centers are not a cross- 
section of those in the community. As a consequence, results from pop- 
ulation-based studies of outcomes are almost invariably inferior to those 
reported in journal articles from academic centers. Journal reports are also 
biased toward favorable outcomes. Rarely are poor results reported. The 
volume of journal articles is virtually overwhelming. It is impossible for any 
physician to read regularly all of the journal articles that contain information 
relevant to his or her practice. For all of these reasons, journal articles alone 
are not an adequate basis for determining when an operation is indicated. 

Another problem with journal articles is that clinicians can rarely use them 
to make decisions for a specific patient, because the information is 
fragmented, unconnected, and difficult to evaluate. A variety of treatment options 
are presented without the information needed to evaluate and compare 
them. This is both a quantitative problem—the volume of information is 
staggering—and a qualitative one: how to sort out useful information from a 
plethora of irrelevancy. It is clearly impossible for individual physicians to 
evaluate all forms of therapy for all of the conditions that they are called upon 
to treat (13). 


Development of Clinical Consensus 


Because of the voluminous, fragmentary, and disconnected nature of medical 
scientific information, physicians turn to various authorities—experts—for 
assistance in forming their conclusions about the value of a treatment or 
procedure. They obtain expert advice through textbooks, medical meetings, 
continuing education courses, and informal contacts. 

Textbooks and continuing education courses provide comprehensive over- 
views and perspectives on the state of practice in the use of operations. 
Emphasis is usually placed on the rationale for use of an operation and the details 
of diagnosis and management. Rarely is much attention given to explicit 
consideration of contraindications, the conditions under which an operation is 
not indicated. Medical meetings provide physicians with multiple opportuni- 
ties to learn from their colleagues. Presentations, like journal articles and 
textbooks, tend to concentrate on outcomes and problems for broad general 
groupings of patients, e.g. “Limb salvage in 600 patients undergoing femoral- 
popliteal bypass.” However, meetings provide physicians with an opportunity 
to question the experts. Discussions with colleagues help physicians shape 
their own perception of the value and use of a procedure. 


Physician Use of New Information 


Because the fragmented and biased nature of available information makes it 
difficult for the average surgeon to keep up, some unnecessary surgery 
undoubtedly results from ignorance. But, sometimes the information is re- 
ceived and rejected. Like others, physicians resist change and they have 
learned not to believe everything they hear from the experts. Most physicians 
can readily recall examples of brilliant ideas that were eventually discredited. 
If new information is contrary to personal experience or an expert opinion 
seems ill-founded, the wise clinician adopts a wait and see attitude. This may 
be true even when there is an expert consensus on an issue. In a study of the 
adoption of recommendations of the National Institutes of Health Consensus 
Conferences, Kanouse et al (24) found that the availability of new information 
concerning the value of a procedure or treatment was insufficient to bring 
about changes in practice patterns. 


Psychosocial Factors 


Individual personality and character traits are important determinants of 
practice style. Age, experience, personality, and specialty influence how 
physicians use tests (15). Family practitioners and internists approach patient 
care differently than surgeons, who may be less risk averse. Training and 
tradition play an important role: Doctors tend to continue doing something the 
way they always have if it seems successful. Motivation, which is a complex 
interplay of self-image, personal standards, and preferences for certain types 
of practice style or patients, is also a key factor (14, 55). Much has been made 
of economic motivation in recent years, but it is unlikely that many surgeons 
recommend useless operations solely because of greed. It seems probable, 
however, that in questionable cases, they are more likely to recommend a 
service they provide. As we have seen, there is evidence that doctors in 
fee-for-service practice recommend more operations than doctors in prepaid 
plans (32, 49). 

Practice patterns may be influenced more by social factors than by personal 
ones (21). Studies of the adoption of new technologies have shown that 
acceptance of a new treatment occurs as a result of a consensus of peers, and 
that endorsement by a local opinion leader is usually required before general 
use will occur (21). Community physicians tend to distrust the scientific 
literature, as much of it is inaccessible or irrelevant. Consequently, they rely 
more heavily on word of mouth and the evaluations of colleagues. Local 
norms, which may vary considerably from region to region, are developed 
(the “surgical signatures” of Wennberg). Interestingly, although there has 
been scientific interest in the factors that lead to adoption of new technolo- 
gies, few investigators have studied the process by which ineffective treat- 
ments are abandoned. Unless abuses are egregious and the evidence is 
unequivocal, leaders seldom speak out against an outdated procedure. In the 
absence of social pressure, physicians are often slow to change. 

The net effect of our system of generation, dissemination, and incorpora- 
tion of medical scientific information is to leave practicing physicians without 
clear guidance as to effective treatment in many situations. There is often little 
or no consensus, or if there is a consensus among experts, the community 
physician may not be aware of it. 


Lack of Consensus 


Lack of consensus has a profound effect on the nature of medical practice. 
Lack of consensus leads different groups of doctors to different conclusions 
about the value of an operation, the major cause of geographic variations. 
Lack of consensus leads to differences of second opinions from the first. And, 
uncertainty stemming from lack of consensus leads surgeons to recommend 
operations for patients who desire them but who will not benefit. 

In summary, although many factors play a role in the occurrence of 
unnecessary surgery, the root cause is the inadequate production, evaluation, 
dissemination, and use of information. If the limits of effectiveness of an 
operation were established for all of its uses, and if that information was 
widely accepted and widely disseminated, the opportunities for misuse would 
be greatly diminished. 


WHAT CAN BE DONE ABOUT UNNECESSARY 
SURGERY? 


It is evident that utilization review, second surgical opinion programs and 
geographic variation analysis with feedback are blunt instruments for improv- 
ing quality. Each takes advantage of the pervasive uncertainty in medical 
practice to intimidate or exert peer pressure on physicians to conform. 
Although these programs may decrease the volume of services that are 
provided, they do so unselectively. As we have seen, there is no evidence that 
any of these programs specifically reduces inappropriate care, and SSOPs may 
actually increase it. Without a focus, these programs are unlikely 
to lead to identifiable improvements in quality. 

To decrease unnecessary surgery, it is first necessary to define it. Physi- 
cians need better information on effectiveness and better dissemination and 
use of that information. Finally, attention must be given to developing more 
effective ways to get doctors to accept and use new information. Outcomes 
research attempts to improve the information base, whereas practice guide- 
lines make it more accessible to physicians. 


Outcomes Research 


The randomized clinical trial is widely accepted as the gold standard for 
measuring effectiveness, but the costs and logistic problems of conducting 
these trials limit their applicability. In recent years, the effectiveness of a 
treatment has increasingly been evaluated by sophisticated analyses of patient 
outcomes. As noted, the AHCPR has launched a major effort—the Medical 
Treatment Effectiveness Program—to evaluate the outcomes of treatment of 
several important diseases, such as cataract, myocardial infarction, and back 
pain. 

Although the importance of studying outcomes is unassailable, ex- 
pectations regarding the usefulness of this information may be exaggerated. 
Meaningful information from outcomes studies requires evaluation of numer- 
ous health factors in addition to the presence of disease. Controlling for these 
variables can be difficult and expensive; thus, effectiveness studies of all 
variants of patient and disease are not possible. Consequently, it is unlikely 
that outcomes studies can ever provide information on more than a minor 
fraction of the thousands of diseases and treatments. 

This is not to say that outcomes studies should not be carried out. On the 
contrary, they offer the most practical hope of obtaining valuable information 
that will both validate treatments that work and lead to elimination of those 
that do not. But for maximum efficiency, outcomes studies should be focused 
on specific treatments for which the information will be of greatest value, i.e. 
on those that are performed in large numbers, show wide geographic varia- 
tions, and are controversial. Despite its limitations, outcomes research is the 
best current hope for improvement in knowledge generation. 


Practice Guidelines 


The other challenge, getting physicians to accept and use effectiveness in- 
formation, may be more difficult. The current movement to develop practice 
guidelines is an attempt to accomplish this mission. The objective of practice 
guidelines is to make effectiveness information accessible and acceptable to 
doctors by providing authoritative statements regarding the appropriateness of 
a procedure for all of its possible indications. These statements are based on 
available evidence and expert opinion. As we have seen, even when there is 
good scientific data, guidelines are needed to translate that information into a 
useable form. If well done, guidelines provide practicing physicians with a 
better informed, more objective, and, therefore, wiser evaluation than they 
can readily obtain from the literature and personal experience. The develop- 
ment of comprehensive practice guidelines is an urgent first priority for 
anyone who wishes to decrease the rate of unnecessary surgery. Fortunately, 
the urgency of that need has risen to national attention within the past several 
years and has been accepted both within the medical profession and by the 
government. 

Practice guideline development is the second major responsibility of the 
AHCPR. Recently, at the agency’s behest, a committee of the Institute of 
Medicine issued a set of attributes and principles for the guideline develop- 
ment process (23). The report stressed the importance of credibility and 
accountability and the need for the link between guidelines and scientific 
evidence to be explicit. It strongly recommended that the guideline develop- 
ment process “include participation by representatives of key affected groups 
and disciplines” to ensure that all relevant evidence is located, that practical 
problems are identified, and that affected groups will cooperate in im- 
plementation (23). 

Professional specialty societies have also begun to develop comprehensive 
and highly specific practice guidelines (50). Early experience suggests that 
these guidelines will be used and will make a difference. For example, 
following the 1987 universal adoption by Massachusetts anesthetists of the 
American Society of Anesthesiologists’ “Standards for Basic Intra-Operative 
Monitoring,” the number of deaths from hypoxia decreased to zero in the 
following year, and for the first time no lawsuits were filed for hypoxic 
damage (40). 

Ultimately, the validity of practice guidelines will depend on the advances 
in scientific knowledge provided by randomized clinical trials and outcomes 
studies. The commitment of Congress to support outcomes studies is, there- 
fore, encouraging. These efforts are complementary. Outcomes data provide 
evidence to be used in guideline development, while the guideline process 
helps focus outcomes research by identifying common clinical conditions for 
which effectiveness information is lacking. 


POLICY IMPLICATIONS 


As Eddy (12) has described, medical practice is in the middle of a profound 
transition. Once it was assumed that physicians’ decisions were, by definition, 
correct; however, evidence now indicates that many are not, and mechanisms 
have been established to second-guess physician judgments. Eddy 
points out, however, that much of medical care is effective, doctors are not 
practicing fraud, and the problems are no one’s fault. Physicians make errors 
because they must make decisions every day on the basis of inadequate 
information. And, they must deal not only with vagaries of scientific knowl- 
edge, but also with variations in patient preferences and expectations, chang- 
ing systems of reimbursement, threats of malpractice, and peer pressure. 

The pace of technologic progress is now such that it has become impossible 
for researchers to provide the information on effectiveness of new treatments 
as rapidly as they are developed. Further, our methods of information dis- 
semination are not adequate to make even that which is known accessible to 
the physician. As a result, it is not surprising that evidence from a variety of 
sources suggests a substantial amount of surgery is unnecessary. The solu- 
tions to these problems will not come quickly or easily, but the movement to 
practice guidelines should ultimately lead to more rational and more accept- 
able medical decision-making. 

Credibility of practice guidelines requires that the judgments be made by 
respected clinical experts, leaders in their fields. But, that may not be enough. 
Surgeons, in particular, are unlikely to accept these recommendations without 
professional endorsement. Surgical leaders must accept the process and sup- 
port the results. The AHCPR has been slow to enlist the support or participa- 
tion of organized medicine—either the American Medical Association or the 
relevant specialty societies. Also, the academic establishment’s position is 
unclear. Some health services researchers are very interested in various 
aspects of guideline development, but they have had limited input into the 
federal process. Without either professional or academic support, it is hard to 
believe that federal guidelines will be accepted. A related question is whether 
the government will require physicians to follow the federally developed 
guidelines. Such a requirement would result in the de facto nullification of 
one of the highest functions of a profession: control of its standards. It seems 
improbable that either organized medicine or individual doctors will readily 
accept such an outcome. If the legitimacy of federal guidelines is challenged, 
as it almost certainly will be, it is not likely that either Congress or the courts 
will support the right of the federal government to practice medicine. To be 
viable, therefore, practice guidelines must be supported by either the academ- 
ic establishment or organized medicine, preferably both. 

The use of practice guidelines will result in significant changes in the way 
doctors practice medicine. For the first time, the identification and significant 
reduction of inappropriate care and unnecessary surgery will be possible. 
Whether this potential will be realized will be determined within the next few 
years by the interplay of government policies and professional reactions. If 
means to cooperate cannot be found, reductions in unnecessary surgery may 
be long in coming. 


BACKGROUND 


In 1978, the United States established the objective of completing the basic 
immunization series of at least 90% of children by age two years by the year 
1990. Although state school immunization laws have led to the immunization 
of over 95% of school enterers (32), the situation among preschoolers is less 
encouraging. Recent outbreak investigations in many inner city areas have 
estimated that only 40-60% of children have completed the series by age two 
years (12, 13, 55). This low coverage among preschoolers is reflected in the 
recent resurgence of measles (7, 11, 57). In 1990, the number of reported 
measles cases (provisional total 27,672) was the highest since 1977 (55,201 
cases reported), compared with a nadir of 1497 cases in 1983. Approximately 
one half of reported cases in 1990 were among preschool children; among 
vaccine-eligible preschoolers aged 16-59 months, 79% were unvaccinated 
(10a). 

There is no mechanism similar to school immunization laws to achieve 
universal immunization of preschoolers. State day care immunization laws 
only affect licensed centers, which care for an estimated 20% of children 
under age 6 years who have working parents (70). There is, therefore, a need 
to design appropriate interventions to increase coverage among preschoolers. 

Determinants of receipt of immunization are complex and interwoven. In 
this paper, we review published studies of determinants of receipt of 
immunization in the US. We classify factors relating to receipt of immunization 
in two broad categories: consumer demand for services and the supply of 
services. Although there is great interplay between the different factors, we 
use this classification to help identify potential interventions to increase 
preschool immunization coverage. 


CONSUMER DEMAND FOR IMMUNIZATION SERVICES 


Factors affecting consumer demand for immunization include beliefs about 
health care and illness and socioeconomic characteristics of the individual and 
of the social group with which the individual interacts. 


Health Beliefs 


Many theoretical models that share common dimensions have been developed 
to explain utilization of such preventive health services as immunization (17). 
One of the frameworks most commonly used to understand and predict 
health-promoting behavior is the “health belief” model, which was formally 
developed by Rosenstock and colleagues (63). This model has four com- 
ponents: “perceived susceptibility,” which refers to the subjective perception 
of risk of vulnerability to a health threat; “perceived severity,” which consists 
of one’s perception of the seriousness of the health threat; “perceived bene- 
fits,” which consists of the belief that the health-promoting behavior will be 
effective; and “perceived barriers,” which refers to the assessment of the 
negative consequences associated with the behavior, such as cost, in- 
convenience, negative perceptions of health services, and side effects. 

Early descriptive studies, which used the health belief framework, were 
conducted on the acceptance of the Salk polio vaccine (14, 27, 50). More 
recently, analytic studies have been conducted on the receipt of influenza or 
swine flu vaccine by adults (1, 18, 38, 66). Janz & Becker (35) reviewed 
studies of health-promoting behavior, clinic utilization, and behavior during 
illness. Perceived barriers had the most frequently reported impact on be- 
havior (93%), followed by perceived susceptibility (86%), benefits (74%), 
and severity (50%). 

Health beliefs motivate the individual to act, but the appropriate behavior 
may not occur unless a cue to action is present. Such cues can be internal (e.g. 
symptoms) or external (e.g. mass media messages, advice from friends or the 
medical profession) (2). Persons who seek vaccination are likely to discuss 
vaccination with friends, peer groups, or physicians or to consider that most 
of their friends have been vaccinated (14, 18, 20, 27, 28, 50). Persons of 
higher socioeconomic status are more likely to have a wide circle of friends 
and to seek advice outside the family (14). 

More recently, the health belief framework has been further extended to 
include the individual’s assessment of his or her ability to carry out the 
health-promoting behavior successfully (“self-efficacy”). This “protection- 
motivation” theory includes many of the components of the health belief 
model, with the addition of the role of self-efficacy (60). Self-efficacy 
influences not only the initiation of the health-promoting behavior, but also a 
person’s persistence in the face of obstacles. 

In summary, motivation to undertake a health-promoting action is in- 
creased when the individual feels vulnerable to a severe threat to health, 
perceives that the action is effective and that he/she can confidently carry out 
the action, and when perceived costs and barriers associated with the action 
are small. Positive motivation is most likely to lead to effective action in the 
presence of a “cue.” Sociodemographic characteristics may act through any of 
these components to affect the likelihood of a health-promoting act. 


Socioeconomic Status 


Economic and demographic measures of socioeconomic status, particularly
parental education, income, family size, and race, have consistently been
shown to influence receipt of immunization (4, 14, 18, 26, 27, 29, 43, 44, 46, 
48, 50, 52, 58, 62). A national telephone survey of access to health care, 
conducted in 1986 by the Robert Wood Johnson Foundation, found that 
children who were uninsured, poor, or nonwhite were less likely to have seen 
a physician in the past year, and uninsured children under age 5 were less 
likely to have up-to-date immunizations. According to parental history, 19% 
of uninsured children, 6% of children with private insurance, and 1% of 
children on Medicaid were not up-to-date (74). 

Many persons at or near the poverty line lack health insurance. In 1986, 
approximately 16% of children under age 13 were covered by public health 
insurance (usually Medicaid), and 18% were uninsured. Medicaid availability 
varies between states, which may impose limitations on eligibility for ser- 
vices, the frequency and number of services provided, and physician 
reimbursement (47, 64). Fees are often well below those paid by the private 
sector; thus, physicians are discouraged from participating in Medicaid (77). 
Medicaid’s complex administrative procedures also obstruct use (5, 68, 77). 
In 1986, Medicaid covered less than half of eligible children (64). Pediatri- 
cian participation in Medicaid is lowest in large metropolitan areas, in which 
the risk of early acquisition of vaccine-preventable diseases is highest (77). 
Medicaid children, like uninsured children, receive more of their care in 
hospital outpatient departments, emergency rooms, and public clinics than in
physicians’ offices.

Provision of health insurance increases service utilization, but does not 
guarantee high immunization rates (21, 41, 65, 74). Dutton (21) and Rundall 
& Wheeler (65) showed that perceived service barriers, negative attitudes of 
consumers towards the type of health services available to the poor, and lower 
belief in susceptibility to illness were greater influences on the receipt of 
immunization by lower socioeconomic groups than the direct effect of pay- 
ment for services. 

In the United Kingdom, despite the availability of free medical care for all, 
use of preventive care services varies greatly between deprived and endowed 
communities (45). In a recent study, Peckham et al (58) mailed questionnaires 
to 1793 health professionals and 3394 parents, in seven districts with “high” 
and eight districts with “low” coverage for measles and pertussis vaccines. 
They developed scores for practice organization, physician knowledge, and 
parental attitudes and combined these with indicators of socioeconomic status 
in multivariate analyses. 

In the practice organization score, a point was assigned when all members 
of the practice team, and not only the physician, could obtain consent for 
vaccination, give injections, and conduct patient recall. In the physician 
knowledge score, a point was assigned for correct knowledge of con- 
traindications to each vaccine. In the parental attitude score, a point was 
assigned for each positive attitude expressed towards severity and infectivity 
of target diseases, efficacy, and safety of vaccines. 
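
The three scores described above are simple additive tallies. As a rough illustration only, the sketch below counts one point per criterion met. The item names and the one-point-per-item reading are assumptions made here for illustration; they are not the scoring instrument used by Peckham et al (58).

```python
def additive_score(items):
    """Count one point for each criterion that is met (value is True)."""
    return sum(1 for met in items.values() if met)


# Hypothetical parent: the item names are illustrative, not taken from the study.
parental_attitude_items = {
    "disease_seen_as_severe": True,
    "disease_seen_as_infectious": True,
    "vaccine_seen_as_effective": False,
    "vaccine_seen_as_safe": True,
}

parental_attitude_score = additive_score(parental_attitude_items)
```

The same tally could be applied to the practice organization and physician knowledge items, with one entry per task or per vaccine.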

Family and parental factors associated with low immunization uptake were 
low social class, large family size, presence of a chronically ill child in the 
family, and low parental attitude score. Of these, parental attitude score had 
the greatest influence. Only 30% of children whose parents had the lowest 
attitude score had received measles vaccine, and coverage increased to 90% 
among children whose parents had the highest score. Practice factors reducing 
immunization uptake were low practice organization score (measles vaccine 
coverage was 72% in practices scoring zero, and 90% in practices scoring 
high), and low physician knowledge score (measles vaccine uptake was 77%
among children who visited the lowest scoring physicians, and 90% among 
children who visited the highest scoring physicians). 

Each group of variables was interrelated. The practice organization score 
was positively correlated with both the physician knowledge and the parental 
attitude scores. Practice organization scores were lower in socially deprived 
areas. Low socioeconomic groups were thus served by practices that were less 
well organized and staffed by less knowledgeable practitioners. This rein- 
forced the parents’ negative attitudes towards the health services and led to 
incomplete immunization. 


SUPPLY OF IMMUNIZATION SERVICES


Published and unpublished reports document the failure of the health system 
to provide easily available and acceptable immunization services to persons in 
the lower socioeconomic groups. The system adversely affects immunization
rates by creating barriers that restrict access, maintaining policies that impede
attendance, missing opportunities to vaccinate persons who do attend, and
failing to use education and follow-up to keep children in the system.


Barriers to Utilization of Immunization Services 


In May 1990, 57 immunization projects that receive federal grant funds, 
including projects in all 50 states, the District of Columbia, and some large
cities and counties, were surveyed by the Centers for Disease Control. 
Fifty-four project managers responded (Table 1). Half stated that children 
were not receiving vaccines in one or more localities because of barriers, 
particularly insufficient clinic staff (70%), insufficient clinic hours (56%), 
and inaccessible clinic locations (15%). These barriers arose because of 
inadequate local resources for vaccine delivery. 

In addition, clinic policies that tend to impede attendance have been
observed in areas affected by preschool measles outbreaks since 1986. Such 
policies include visits by appointment only; waits of several weeks for 
appointments; vaccination on only certain days of the week; limits on the 
number of clients registered per day; long waiting times; residency restrictions 
(not accepting out-of-county residents); the need for physician referral; com- 
prehensive physical examinations before vaccination; vaccinations adminis- 
tered only by physicians; charges for vaccine administration, either flat-rate or 
sliding scale; and the need to sign a statement of inability to pay a private 
physician (56). 


Table 1 Barriers to immunization identified by 27 of 54
immunization program managers*†

Potential barrier                                   n     %

Insufficient staff                                 19    70
Insufficient clinic hours                          15    56
Inadequate clinic location                          4    15
Appointment-only systems                          235 9%
Prior physical examinations required               15    56
Physician referral needed                          11    41
Immunizations given only in well-baby clinics            37
Financial screening/vaccine fees                    6    22

*27 program managers did not identify any barriers to immuniza-
tion in their immunization projects.
†From Ref. 56.


Missed Immunization Opportunities


Missed opportunities may occur when a child attends immunization services, 
but does not receive all the vaccines for which he or she is eligible 
(nonsimultaneous vaccination), or when a child who is eligible for vaccina- 
tion attends other health care services (e.g. acute care visits), but is not 
immunized. They may occur because of failure to screen a child’s immuniza- 
tion eligibility, failure to provide vaccination at the same locale as curative 
services, or inappropriate policies on contraindications to vaccination. 
Table 2 summarizes studies on missed immunization opportunities in the 
US. Record audits at emergency rooms (40), public clinics (24), and pediatric 
inpatient facilities (69) showed that up to 75.5% of children who attended did 
not receive all the vaccines for which they were eligible. A Utah study (39) 
indicated that illness of the child was a major reason for failure to immunize. 
In 1973, health department audits in Tennessee indicated that 10-24% of 
children had a missed immunization opportunity, most often because of delay
of measles vaccination for tuberculin testing (30). In Gainesville, Georgia in
1986 (31) and in Virginia in 1987 (36), health department audits showed that
21% of 21-23-month-olds and 19% of 24-month-olds, respectively, had a
missed opportunity for simultaneous vaccination.


Table 2 Studies on missed immunization opportunities in the US

                       Number of  Percent of  Type of             Percent with
                       records    children    missed              >=1 missed    Ref.
Site                   audited    vaccinated  opportunity         opportunity   number

Health department      294        78.6        Non-sim + curative  10.5          30
Health department      133        42.1        Non-sim + curative  24
Health department                             Non-sim
Health department                             Non-sim
Community                                     Non-sim + curative
Pediatric inpatients                          Curative
Emergency room                                Curative
Public clinic (1989)                          Non-sim + curative

. . . no information
Only unvaccinated children studied
Non-sim: non-simultaneous vaccination

Simplifying clinic procedures to reduce missed opportunities has dramati- 
cally increased adult immunization rates. A standing order, which gave nurses 
the responsibility to identify and vaccinate patients at a general medical clinic, 
increased influenza immunization rates from 28% before the intervention to 
81% during the intervention (42). Immunization rates in control clinics, in 
which a specific physician’s order was required, remained at 29%. 

Missed opportunities have also been reduced by systems that prompt 
providers to screen the vaccination status of all clinic attendees. In a busy 
internal medicine clinic, reminder questionnaires were attached to patients’
charts during the influenza season (25). Immunization rates for influenza and 
pneumococcal vaccines rose from 2.9% and 5.5% of eligible patients pre- 
intervention to 75% and 67%, respectively, during the intervention. In an 
infant primary care clinic, for children with appointments, clerks attached an 
immunization information label to the clinic note to be used by the physician 
(6). Receipt of third dose diphtheria-tetanus-pertussis/oral polio vaccine 
(DTP/OPV) by age 190 days increased significantly, from 25% preinterven- 
tion to 33% postintervention, even though approximately 50% of infants in 
the intervention period had unscheduled visits without appointments and thus 
did not have the label placed on the clinic note. 

The feasibility and acceptability of immunizing children at curative ser- 
vices has also been shown (72, 73). In a 1966-1967 study in a pediatric 
emergency room in Los Angeles County, a health aide or public health nurse 
completed a record of the immunizations required. The physician signed the 
record if there were no contraindications, and the nurse then vaccinated the 
child and gave a written return appointment. Follow-up was conducted seven 
days after a broken appointment, first by postcard, then by a second postcard, 
telephone, or telegram. Fifty-two percent of children kept their scheduled 
appointment. An additional 27% returned after the reminders (73). 

Potential immunization opportunities exist outside the health sector at other 
public assistance programs. Parents of children with measles were in- 
terviewed in Milwaukee, Chicago, Dallas, and Los Angeles in 1989-1990 
(34). Of 397 vaccine-eligible measles cases, 40-91% were enrolled in one or
more social programs. Children under one year of age were more likely to be
enrolled in the Women, Infants, and Children (WIC) program; those one year
or older were more likely to receive Aid to Families with Dependent Children (AFDC).

Most children have contacts with the health system. The National Medical 
Care Utilization and Expenditure survey showed that even among the poor, 
over 87% of children had at least one contact with a health care provider in the 
previous year (9). These contacts are opportunities to immunize children and
to keep the children in the system, by educating and motivating parents to 
return and by conducting active follow-up. 


Patient Education 


Although many studies have shown that physicians’ advice influences parents 
(1, 14, 18, 27), we have not found published intervention studies of the effect 
of clinic-based education. One paper has evaluated the effect of health 
education in the hospital setting, one through the mass media, and one in the 
school setting. 

A 1986 study in St. Louis compared three hospital-based educational 
interventions with a control group. Postpartum mothers were randomly 
assigned to receive an “immunization packet” (containing an immunization 
record for the child and a booklet on the importance of immunizations), the 
packet plus a video presentation, or both of these plus a phone call reminder 
when the child was two months of age. There was no significant difference 
between groups in the percentage that received first dose DTP/OPV (average 
94.5%) or three doses of DTP and two of oral polio (average 69%) (16). 

Peterson (59) reported no increase in vaccination activities after a 1975-1977
mass media campaign to promote routine immunization in Missouri.
However, during an outbreak, vaccination activities increased when mass 
publicity was combined with the provision of extra clinics. 

School-based health education was evaluated in Denver before the school 
laws (71). Parents of 2028 children received a colorful pamphlet and newslet- 
ters about immunization. Students and parent-teacher organizations partici- 
pated in immunization-oriented projects. Three months after the education 
campaign, only ten of 569 immunization-deficient children had been im-
munized. This contrasted with schools that sent reminders to parents and
provided immunization on site, where 66% of 653 immunization-deficient 
children were immunized in the same period. 

Although none of these studies showed an impact of the educational 
program on immunization rates, the educational interventions were not de- 
signed after an assessment of the specific informational needs or of the most 
appropriate communication methods for those populations, and mass media 
techniques have developed considerably since 1976. It is, therefore, difficult 
to draw conclusions about the potential effectiveness of health education 
activities. 


Follow-up Systems 


Active follow-up of children can be conducted by general reminders, which 
provide information on the importance of immunization, or specific remind- 
ers, which inform parents of their own child’s needs and where 
immunizations can be obtained. They may be distributed to all parents
(universal) or only to “high risk” families, and can be sent before a scheduled 
appointment (“tickler systems”) or after a child has defaulted (“recall”). The 
increase in percentage of the target group receiving immunization after fol- 
low-up has varied greatly: 0-28% for general, universal reminders (3, 10, 16, 
46); 5-16% for general, high-risk reminders (15, 76); and 13-33% for 
specific reminders (51, 75; K. Tollestrup and B. B. Hubbard 1989, un- 
published data, Washington State Department of Health and Social Services). 
All studies of specific reminders showed significant increases in attendance. 
There was no consistent difference in response rates to mailed or telephone 
reminders. Most studies reported difficulty in locating families: Approximate- 
ly one third of families were untraceable because of inaccurate addresses, lack 
of telephones, or migration from the area. 

The cost of follow-up varied greatly. In the early 1980s, Yokley & Glen- 
wick (75) found that a specific letter combined with a lottery ticket incentive 
was more effective in the short term than a specific letter alone or combined 
with increased clinic hours, and all were more effective than a general 
mailout. However, by three months postintervention, the specific letter alone 
was the most cost-effective intervention, at an estimated cost of $2.27 per 
additional child immunized, compared with $6.91 for the specific letter plus 
lottery ticket, $6.28 for the specific letter and increased clinic hours, and 
$3.64 for the general mailout. 
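
The cost-effectiveness figures above are ratios of intervention cost to the number of additional children immunized. A minimal sketch of that arithmetic follows. The function and variable names are illustrative assumptions; only the four per-child dollar figures come from the study cited above (75).

```python
def cost_per_additional_child(total_cost, additional_children):
    """Return intervention cost divided by additional children immunized."""
    if additional_children <= 0:
        raise ValueError("additional_children must be positive")
    return total_cost / additional_children


# Per-child costs at three months postintervention, as reported in the text (75).
REPORTED_COST_PER_CHILD = {
    "specific letter": 2.27,
    "general mailout": 3.64,
    "specific letter + increased clinic hours": 6.28,
    "specific letter + lottery ticket": 6.91,
}


def rank_by_cost(costs):
    """Sort interventions from lowest to highest cost per additional child."""
    return sorted(costs.items(), key=lambda pair: pair[1])


ranking = rank_by_cost(REPORTED_COST_PER_CHILD)
```

The first entry of `ranking` is the intervention with the lowest cost per additional child immunized. The ratio says nothing about how many children each intervention reached, so it should be read alongside the coverage figures.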

The most effective reminder contains specific information about the child’s 
own vaccine requirements and reinforces health beliefs about susceptibility to 
disease. Mailed reminders appear as effective as telephone reminders.


DISCUSSION 


Studies on the receipt of immunization have used different methods and 
conceptual approaches. Few studies have examined the role of health beliefs, 
socioeconomic factors, and service delivery factors in equal depth. Further- 
more, many studies had potential methodological problems (49). Sampling 
bias and variation in determinants of receipt of vaccination across cultures 
sometimes limited the generalizability of findings (1, 4, 44). Many studies 
relied on self-reporting of vaccinations received, which may have led to 
misclassification of vaccination status (1, 4, 14, 18, 27, 38, 39, 50, 64).

The interpretation of data on health beliefs is complex. In retrospective
studies, it is difficult to attribute cause and effect to statistical relationships. 
Even with prospective studies, prior experience may shape consumer atti- 
tudes, and attitudes may not remain constant over time (8, 37). Major 
controversies, such as the Cutter incident, in which 260 paralytic poliomyeli- 
tis cases were caused by the use of lots of Cutter polio vaccine that contained 
active virus (53), and the 1976 swine flu episode, during which cases of
Guillain-Barré syndrome followed a mass campaign of swine flu vaccination
(67), may have influenced attitudes towards vaccine safety in studies con- 
ducted shortly afterwards. Findings from adult immunization studies may not 
be applicable to the preschool population. 

Despite potential methodological problems, studies have given consistent 
results. Most children begin the immunization series; coverage of first dose 
DTP was over 90% in many studies (9, 16, 29, 39, 44). The major problem is 
failure to complete the immunization schedule on time. Lower socioeconomic 
groups are least likely to be immunized on time. Improved financial access to 
health care increases utilization, but does not eliminate socioeconomic differ- 
entials, because of other barriers associated with the health services available 
to the poor and near-poor and the negative attitudes towards health services 
that these barriers create. Favorable health beliefs affect intention to obtain
vaccination, but may be neither sufficient nor necessary conditions for 
vaccination (50). 

The causes of low immunization coverage among preschoolers are multi- 
factorial, and studies to date have not demonstrated which factors are the most 
important. However, the characteristics of persons who are least motivated to 
obtain timely vaccinations for their children are known, as are the characteris- 
tics of the health services that deter these families from seeking vaccination. 
Although we are not able to say which single factor is the most important in 
predicting receipt of immunizations, we have identified potentially correct- 
able causes of low immunization coverage among preschoolers. 

This review has identified five priority areas for improvement of service 
provision within the existing health care system. First, barriers to immuniza- 
tion must be removed by increasing clinic staff and clinic hours of operation. 
Second, clinic policies should allow vaccination on demand and minimize 
bureaucratic obstacles. Third, existing contacts with families must be used to 
immunize, through simultaneous administration of vaccines and screening of 
eligibility and vaccination at curative services. Fourth, active follow-up, with 
specific, rather than general, reminders, should be conducted. Fifth, other 
potential contacts with the high-risk population should be exploited. Im- 
munization should be linked to such services as WIC, AFDC, or housing 
programs. Where resources permit, immunization should be provided on site. 
At a minimum, the child’s immunization status should be assessed and the 
parent should be referred for immunization. 

Concurrently with action to improve the provision of immunization ser- 
vices, operational research should be conducted to clarify issues relating to 
consumer demand. The influence of clinic organization and health workers’ 
knowledge and attitudes on consumer behavior in the US should be studied 
and appropriate interventions should be developed. In-depth studies of differ-
ent methods of health education should be conducted. Such methods include a
home-based immunization record, which is designed to educate the mother
about the immunization schedule and remind her of appointments (as in
developing countries), and clinic-based education by persons who have au-
thority that is recognized by parents. Positive incentives, such as the lottery
tickets used by Yokley & Glenwick (75), may be helpful in the short term, for
example, to raise coverage quickly during an outbreak. Their long-term
cost-effectiveness requires careful evaluation. Links with influential commu- 
nity groups should be explored, and the applicability to the US of community 
mobilization techniques used in developing countries should be evaluated. 

To implement these recommendations, resources must be invested in the 
public sector. Increases in federal assistance to the immunization program in 
the 1980s have only compensated for the exponential increase in vaccine costs 
and have not always been matched by increased state resources. Federal 
funding for such programs as the Maternal and Child Health Services block 
grant and the Community Health Centers program decreased in real terms by 
11-43% between 1978 and 1986, and only a minority of those in need are 
reached (70). Recent changes in the Medicaid program have the potential to 
increase coverage of pregnant women and infants (54). However, unless 
reimbursement fees are increased and administrative procedures are stream- 
lined, pediatrician participation in Medicaid will probably continue to fall 
(77). In the long term, there is a need to work towards a universal, equitable 
health care system (19, 22, 54, 61). 

The World Health Organization has emphasized the need for political will 
to achieve universal childhood immunization (23). Though it is tempting for 
health care providers to attribute low immunization uptake to consumer 
apathy, much evidence points to correctable deficiencies of the health care 
system. The nation’s preschool immunization objectives will only be reached 
if society has the commitment to respond to needs. 


INTRODUCTION


Pharmaceuticals contribute a small share of total health care expenditures, but 
have nonetheless generated a large share of public sector regulatory attention. 
In 1986, drugs and medical sundries constituted only 6.5% of total health 
expenditures, which represents a decline from a 13.6% proportion in 1950 
and 1960, a 10.7% share in 1970, and 7.6% share in 1980 (10). Even in terms 
of annual changes, expenditures for pharmaceuticals are modest when com-
pared with those for all other categories, although the growth rate for phar-
maceuticals has exceeded that of the consumer price index since 1975. As overall
health expenditures rise, however, and states attempt to constrain ever-rising
costs of their Medicaid programs, any major expenditure category must be
considered for reductions. As a result, pharmaceuticals appear an attractive
target for cost-containment mechanisms. This paper examines one approach to
the attempted reduction in Medicaid program costs—the imposition of restric- 
tions on access to pharmaceuticals reimbursed through Medicaid programs. 


BACKGROUND


In January 1966, Title XIX of the United States Social Security Act took
effect, thus creating the Medicaid program. Federal matching funds were
provided to states for the provision of basic medical care to low income 
populations. Medicaid, along with the Medicare program, emerged as the 
largest public assistance system in the US. Because Medicaid programs are 
entirely administered by the separate states, Medicaid eligibility requirements 
are among the most complex of any public assistance programs. States may 
also elect to provide optional services. 

Under certain federal conditions, states that provide optional services may 
impose limitations on service delivery by utilizing cost-sharing requirements 
and other stringent restrictions. One optional service, prescription drug cover-
age, is offered as a benefit by all state Medicaid programs except
those in Alaska and Wyoming (8). However, significant variation in these
states’ prescription drug programs exists. For example, imposition of copay- 
ments, limitation in the number of prescriptions per recipient, and reimburse- 
ment restrictions of varying degrees of severity are common among states that 
provide this service. 

Medicaid programs have been faced with shrinking financial resources, 
while also facing increases in program expenditures caused by health care 
inflation. Currently, Medicaid accounts for approximately one third of state 
and local government health care expenditures, and often represents the 
largest program in a state’s budget (9). The political climate regarding rising 
health care costs has provided Congress, as well as state legislatures, with the 
impetus to restrict government spending on all health programs. In 1981, 
Congress passed the Omnibus Budget Reconciliation Act, which effectively 
cut federal support for Medicaid and conferred on states greater flexibility and
responsibility for developing cost-containment policies. 


Medicaid Drug Formularies 


Responding to this intense economic pressure, states have implemented cost- 
containment measures to control, for example, prescription drug costs. Many 
states have implemented drug formularies, which are statewide lists of basic 
drugs, as criteria for Medicaid reimbursement to reduce drug and overall 
Medicaid costs. Thus, restrictions on the use of certain drug products or entire 
therapeutic categories are often adopted. 

Drug formularies are categorized as either open or restricted. Restricted, or 
closed, drug formularies are characterized by a state’s policy not to pay for 
prescription drugs unless they are specifically listed. Open drug formularies 
provide prescribing guidelines that generally pay for all prescribed drugs, 
regardless of their inclusion on a drug list. States with open formularies, 
which provide some guidance in drug reimbursement, as well as states with
no formal drug reimbursement policy, are referred to as open formulary
states.

It is difficult to compare Medicaid formularies across states accurately, 
because of differences in important related administrative policies. For ex- 
ample, some states with a restrictive drug formulary may provide other 
mechanisms, such as a prior authorization protocol, to enable Medicaid patients
to have access to nonformulary drugs. 

Prior authorization is an example of treatment authorization requests re- 
quired by Medicaid for additional treatment or, in this case, prescription drugs 
that are not specified as reimbursable by the program. Prior authorization for 
prescription drugs requires that a physician or pharmacist document the need 
of a Medicaid patient for a nonreimbursable drug product. This request is
then submitted, usually by telephone or mail, to a Medicaid field office for 
approval or denial. If the majority of prior authorization requests are 
approved, a restrictive formulary with prior authorization may actually pro- 
vide more availability of drugs to beneficiaries than would a less restrictive 
formulary with no exemption process. California, for example, is a state that 
has a restrictive formulary, but also administers a formal prior authorization 
program, the treatment authorization request (TAR). 

Another characteristic of a drug formulary is the concept of positive versus 
negative formulary lists. A positive formulary is a list of drugs that are fully 
reimbursable by the payor of services. This type of formulary usually lists 
drugs of choice based upon scientific and clinical evidence. Choices are 
usually determined with strong medical and pharmaceutical input on the basis 
of cost, efficacy, and safety. In contrast, a negative formulary is a list of drugs 
or therapeutic categories that are not reimbursable by the payor of services. 
For example, drugs used for anorexia or weight gain, fertility, hair growth,
smoking cessation, and coughs and colds are often included on negative 
formularies. 
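
The formulary distinctions described above (open versus restricted, positive versus negative lists, and prior authorization) can be read as a small decision rule. The sketch below is one hedged interpretation, not any state's actual policy. The class and field names are hypothetical, and the assumption that prior authorization cannot override a negative list is made here only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class FormularyType(Enum):
    OPEN = "open"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class Formulary:
    kind: FormularyType
    positive_list: frozenset = frozenset()  # drugs explicitly reimbursable
    negative_list: frozenset = frozenset()  # drugs explicitly not reimbursable
    allows_prior_authorization: bool = False


def is_reimbursable(drug, formulary, prior_authorization_approved=False):
    """Apply the decision rule sketched in the lead-in paragraph."""
    # Assumption: a negative listing always excludes the drug.
    if drug in formulary.negative_list:
        return False
    # Open formularies generally pay for all prescribed drugs.
    if formulary.kind is FormularyType.OPEN:
        return True
    # Restricted formularies pay only for specifically listed drugs...
    if drug in formulary.positive_list:
        return True
    # ...unless an exemption process exists and the request was approved.
    return formulary.allows_prior_authorization and prior_authorization_approved


restricted = Formulary(
    kind=FormularyType.RESTRICTED,
    positive_list=frozenset({"drug_a"}),
    allows_prior_authorization=True,
)
```

Under this reading, `is_reimbursable("drug_b", restricted)` is false, but it becomes true once an approved prior authorization is supplied. This mirrors the point that a restrictive formulary with an exemption process may give broader access than it first appears.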


Literature Review on Medicaid Drug Formularies 


Research regarding Medicaid drug formularies has centered around the issue 
of restrictive versus open formularies. Significant controversy has been elic- 
ited, as restrictive formulary proponents commend fiscal restraints, while 
opponents criticize drug formularies as an obstacle to the efficient practice of 
high quality medicine. Additionally, wide variation of conclusions in the 
literature concerning the effect of restrictive formularies makes interpretation 
difficult, especially because most studies (including 1, 4, 6, 7) are funded at 
least in part by the pharmaceutical industry. 

Although many studies of the effects of restricted drug access exist, their 
results are not generalizable, because of differences in the state Medicaid 
programs analyzed. Among the current research studies, two types of analy-
ses have been employed: single state studies and multistate studies.

Single state studies are the most common among Medicaid formulary
evaluations. However, because these analyses are limited to the experience in
a single state, they tend to reflect the idiosyncrasies of formulary restrictions
in the state, e.g. the prior authorization process or TAR in California. In
addition, single state studies that compare expenditures before and
after a formulary policy change tend to attribute any differences to that
change without critically examining other policy or environmental changes
that might coincide with the state-specific formulary policy. Therefore, con-
clusions are often difficult to validate as true causal effects. Finally, pre-post
single state studies monitor only short-term effects of formulary restrictions
and thus do not consider long-term consequences of new physician prescrib-
ing or treatment patterns.

Factors that are often controlled in multistate studies, but lacking in single
state studies, include program size, composition, and reimbursement trends.
These variables are important, because Medicaid program costs are a function
of program size and are therefore subject to changes in the number of eligibles.
However, even per-eligible costs vary with the severity and duration of 
economic cycles. Additionally, the composition of those persons on Medicaid 
varies, which is particularly evident in California and New York, two very 
populous states facing the AIDS epidemic. Also, state reductions in Medicaid 
reimbursement levels affect Medicaid access, as the number of participating 
Medicaid providers changes according to reimbursement levels. 

In the following literature review, we only discuss published studies. Thus, 
seven studies are examined, of which two (4, 7) are published by the National 
Pharmaceutical Council. Additionally, we separate formulary studies on the 
basis of Medicaid cost results and Medicaid drug access. We first present 
Medicaid cost studies, which are further classified into multistate and single 
state studies. 

In 1972, Hammel (3) conducted an early multistate study, which examined
open and restrictive formularies. His evaluation examined Medicaid
expenditures of nine Western states and ten Southern states, by using data
provided by the Department of Health, Education, and Welfare. The 19 states
were categorized as either open or restrictive, and Medicaid expenditures
were compared between those classifications. Hammel concluded that in both
the West and South, restrictive formulary policies resulted in higher state
per capita Medicaid expenditures than did open formularies. This study is
weak, however, because it is old and does not adequately control for
differences in eligibility requirements and variations in benefits among the
different Western and Southern Medicaid programs.

The multistate study by Smith & Simmons (7) utilized a combined time
series cross-sectional methodology to examine the effects of formulary
limitations in Medicaid drug programs from 1973 to 1980. A multivariate
analysis was developed that included independent variables specific to
formulary restrictions: drug price, utilization limitations, and other
Medicaid policies. Medicaid drug expenditures per eligible and recipient, as
well as a participation rate ratio of recipients to eligibles, were used as
the dependent variables.

Smith & Simmons did not find strong results in their model. Independent 
variables were highly intercorrelated, thus giving imprecise parameter es- 
timates. Additionally, the multivariate analysis results were mixed regarding 
formulary restrictions. In some cases, formulary restrictions appeared to 
reduce Medicaid expenditures on drugs; in most cases, however, formulary 
restrictions were associated with an increase in Medicaid expenditures. This 
conclusion was not statistically convincing, because of the high multi- 
collinearity between the independent variables. 

A more recent multistate expenditure study, by Schweitzer et al (6), 
investigated the relationship between formulary restrictiveness and Medicaid 
expenditures. The authors tested the hypothesis that restrictive drug access 
lowers total expenditures. They used cross-section regression analysis to 
determine the financial impact of restrictive drug formularies by analyzing 
drug expenditures as a function of formulary policy and other variables. 
Seven states—California, Illinois, Kentucky, Mississippi, New York, South 
Carolina, and Washington—were analyzed. 

Schweitzer et al found that restrictive formularies did not appear to reduce 
Medicaid drug expenditures. But, in contrast to the findings of the earlier, 
more limited studies, the authors observed that total Medicaid expenditures 
were reduced in restrictive formulary states. The authors reasoned that the 
restrictive formularies may not directly cause this reduction in expenditures, 
but may merely represent a proxy for general Medicaid restrictive cost- 
containment programs. 

Schweitzer et al’s contrasting result of reduced total Medicaid expenditures
may be attributed to a more thorough control of external variables. Their
regression model controlled to a greater extent for such factors as medical
practice patterns, health status, demography, and prevailing illness patterns
among each state’s total population, not just the Medicaid population. Thus,
it better accounted for interstate differences in patterns of medical care,
health care cost trends, demographic differences, and variations in morbidity
across states. Additionally, Schweitzer et al examined a longer period (ten
years), thus monitoring longer term effects than other multistate studies that
utilized shorter time series.

Although Schweitzer et al’s research is plausible and adequately controls
for numerous variables, the study fails to establish that the association
between restrictive drug access and lower overall Medicaid expenditures is
causal. Schweitzer addresses this issue in his conclusion by stating, “The
associations do not appear to be simple or obvious.” Schweitzer’s study also
differs from the other multistate studies, in that substitution effects are
not considered.

Several unpublished studies of specific drug restrictions determined that if 
close substitutes for a drug are not on the formulary, the total cost of treatment 
for an episode of illness can increase dramatically. One must weigh the higher 
cost of a new drug against the cost of other health services, such as physician 
office visits and hospital admissions, which often act as substitutes and are 
orders of magnitude more expensive. The net effect depends on many factors, 
such as the extent of potential unnecessary use of the more expensive drug and 
the likelihood that nonuse will lead to the other increased services. For some 
exclusions, it is certainly possible that the net costs may rise. Whether this is a 
broadly observed phenomenon depends on the characteristics of a formulary. 

Although single state studies are less generalizable than multistate studies, 
their relative abundance provides interesting state specific information. Meyer 
et al (5) evaluated Tennessee’s Medicaid drug program qualitatively. They 
described Tennessee’s attempt to adapt features of an unrestricted drug for- 
mulary and drug utilization review. The authors supported Tennessee’s efforts 
and anticipated that the program would benefit the citizens of the state. 
However, because this study lacked adequate quantitative validation, the 
conclusions should be viewed with skepticism. 

In 1980, Hefner (4) conducted a study that compared two states, to control
for the effects of general trends in Medicaid recipients’ use of services.
Hefner compared Texas, which has an open formulary, with Louisiana, which
has a closed formulary. Louisiana instituted a negative formulary in July
1976, which excluded anorexics, cough and cold remedies, minor tranquilizers,
multiple-ingredient anti-anemia preparations, certain gastrointestinal
drugs, certain vitamins, enzymes, and other miscellaneous products.
Louisiana Medicaid officials estimated that the restrictive formulary would
decrease drug expenditures by 15.68% and save $5.6 million.

Hefner’s method involved two approaches: a longitudinal study of the 
Medicaid program utilization patterns before and after the 1976 Louisiana 
restrictions and integration of the frequency of disease patterns associated 
with provider encounters into the longitudinal results. Hefner utilized an 
18-month study period, which was divided into three six-month periods: a 
six-month preperiod or control period, a six-month adjustment period, and a 
six-month comparison period. Hefner also matched samples of eligibles for 
the control and comparison study periods by population characteristics, such 
as aid category, sex, age, race, and residence. Additionally, Hefner examined 
specific units of service that represented the largest cost items in the Medicaid 
program. 

Hefner’s study found that cost increases associated with increased
utilization of other medical services were 3.5 times the savings from a
restrictive drug formulary. Although this study’s method of controlling for
general trends was an improvement over an earlier unpublished single state
study by Hefner, he failed to distinguish the effect of other nonpolicy
influences on Medicaid program costs. Also, the implementation of Louisiana’s
negative formulary policy was not viewed as substantial enough to warrant
such large increases in hospitalizations in a relatively short period of
study.

In 1989, Dranove (1) evaluated the cost-effectiveness of formulary
restrictions in Illinois. In 1984, Illinois’s Department of Public Aid eased
restrictions on anti-infective drug products for Medicaid reimbursement.
Dranove studied the effect of this new policy on both the use of physicians’
services and the cost of treating bacterial infections. He constructed a
sample of Medicaid ambulatory patients with bacterial infections in 1983 and
1984. By using regression analysis, Dranove concluded that the number of
physician visits per patient decreased after the addition of new
anti-infective drug products. However, he also concluded that this decrease
in physician utilization did not offset the higher drug costs of the new
policy. Although Dranove’s method for assessing the impact of the new policy
is interesting, the analysis linking relaxed formulary restrictions to
utilization of other services lacks controls for other influences.
Additionally, Dranove did not evaluate the effect of the formulary change on
inpatient hospital costs, which could have strengthened his findings on the
utilization of other services.

Because Dranove’s study lacks statistically significant results and excludes
the impact of formulary changes on inpatient hospital costs, his conclusions
on decreased physician visit utilization with the addition of new
anti-infective drug products are not convincing. Dranove’s results indicate
that no association exists between easing formulary restrictions and
physician utilization and cost.

Although there is evidence, albeit tentative and statistically weak, that
Medicaid expenditures may actually rise in response to formulary restrictions,
the overall view is indeterminate. Numerous narrowly focused studies derive
an opposite association between restrictive Medicaid formularies and
expenditures, whereas the more thorough, multistate Schweitzer study finds an
association, but cannot attribute causality. In conclusion, there is not a
strong case that formularies either raise or reduce Medicaid expenditures.


NEW RESEARCH AND FUTURE DEVELOPMENT 


Drug formularies have become an increasingly popular cost-minimization 
strategy. Supporting this trend is the fact that in 1988, only four of 48 
Medicaid programs had completely “open” formularies, which reimbursed for 
all Food and Drug Administration (FDA) approved drug products (8). In the 
remaining Medicaid programs, approval of a new drug product by the FDA 
does not guarantee Medicaid recipients access to new drugs. Products must be 
approved by Medicaid programs before reimbursement is secured. 

In the case of California’s Medicaid drug formulary, the formulary is
updated by an ongoing process of adding newer drug products and deleting
older, inferior drug products. The Department of Health Services uses five
criteria: cost, essential need, safety, efficacy, and misuse potential. Each
state’s formulary drug determinations vary depending on the state’s policy. In
general, the criteria seem to be based on cost and availability of substitutes;
however, substitutes are often not very close.

In the area of social policy and Medicaid drug formularies, Schweitzer et al 
(6) and Grabowski (2) examined Medicaid beneficiaries’ access to new drug 
products. The Schweitzer et al study investigated the “social drug lag” 
between the time a drug is approved for marketing by the FDA and the time it 
is available to a state’s indigent population through the Medicaid program. 

In this study, Schweitzer established a drug lag index for comparing states 
and analyzing time trends. In particular, the drug lag index measured the 
fraction of time that a new drug is available to a state’s Medicaid beneficiaries 
during the first two years of market life from FDA approval. 
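
The index described above can be illustrated with a minimal sketch. It
assumes that the two-year window is counted as 730 days and that a drug,
once listed on a state formulary, stays listed for the rest of the window;
the function name and both assumptions are ours, not details taken from the
Schweitzer et al study.

```python
from datetime import date
from typing import Optional

WINDOW_DAYS = 730  # first two years of market life, counted as 730 days (assumption)


def drug_lag_index(fda_approval: date, formulary_listing: Optional[date]) -> float:
    """Fraction of the two-year window in which the drug was on the formulary.

    Returns 0.0 if the drug was never listed or was listed after the window.
    Assumes that a listed drug remains listed through the end of the window.
    """
    if formulary_listing is None:
        return 0.0
    lag_days = max((formulary_listing - fda_approval).days, 0)
    available_days = max(WINDOW_DAYS - lag_days, 0)
    return available_days / WINDOW_DAYS


# A drug listed exactly one year (365 days) after FDA approval is available
# for half of the window.
print(drug_lag_index(date(1975, 1, 1), date(1976, 1, 1)))
```

An index of 1.0 corresponds to immediate availability, and averaging the
index over all new drugs in a state gives one way of comparing states or
time periods.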

Schweitzer found that from 1970 to 1980, the FDA approved 120 new drug
products for marketing. However, the share of these new drug products adopted
by formulary states ranged from 19% to 73%. Of the seven states in the study,
Illinois, Washington, South Carolina, Mississippi, and New York had
relatively short average lengths of time for approval. In the other two
states, California and Kentucky, the average lengths of time for approval
were relatively long.

The range in the approval lag was very large. For those drugs that were
eventually approved, Kentucky averaged more than five years for approval,
whereas Washington averaged a little more than one year. Additionally,
Schweitzer et al discovered that the drug approval lag trend decreased over the
period of study for all states combined.

Grabowski’s study also examined the effects of drug formularies on the 
availability of new drugs to Medicaid beneficiaries. The study analyzed the 
impact of drug formulary time delays on the marketing exclusivity periods 
and related factors that influence drug innovation incentives. Grabowski’s 
research showed three categories of drugs to be the least available: psy- 
chotherapeutics, anti-infectives, and antifertility products. Grabowski also 
found that a typical new drug product was available to Medicaid patients only 
two of the first five years of market life. Thus, the study showed that 
Medicaid patients in restrictive formulary states had significantly restricted 
access to new drug products. Furthermore, restrictions were not limited to 
duplicate drug products, but included drugs that exhibited strong, non- 
Medicaid market performance, as well as high FDA therapeutic importance. 

As demonstrated by Schweitzer’s and Grabowski’s research, wide variation
exists in the restrictiveness of state formularies. In some cases, the delay
between the first marketing of a new drug product and its availability to
Medicaid beneficiaries lasts only months; in other states, the lag is frequently
several years. And, in many states, products are never approved for Medicaid
reimbursement.

A recent unpublished study by Grabowski and colleagues (1991, The 
Medicaid Drug Lag: Adoption of New Drugs by State Medicaid Formulas, 
Duke University), which was funded by Glaxo Pharmaceutical, investigated 
indigent patients’ access to new drugs in various states with Medicaid 
formularies. This study was an extension of the two previous analyses of the 
Medicaid approval process by Schweitzer et al and Grabowski. In this paper, 
however, a larger sample of states with Medicaid formularies was examined 
for a longer time period, 1970-1985. Time trends were presented for this 
extended period and for the more recent period of 1979-1985. As in Gra- 
bowski’s earlier study, therapeutic categories and market sales were also 
investigated. The experiences of nine states with Medicaid formularies (Cali- 
fornia, Illinois, Kentucky, Mississippi, Missouri, New York, South Carolina, 
Tennessee and Washington) were examined in more detail. 

The study found that Medicaid patients continue to face significant
restrictions regarding new outpatient drugs. Although a positive trend toward
increasing availability was observed during the 1970s, this was not the case in
1979-1985. In this period, the typical new drug compound experienced a lag
of 20 months in securing a position on these formularies and was available
less than 40% of the time during the first four years of market life.
Furthermore, certain categories of drugs, such as anti-infectives and
psychopharmacologics, were particularly restricted by Medicaid formularies.
New drugs of commercial and therapeutic importance in these and other
therapeutic categories also experienced significant restrictions on
availability. In addition, greater drug availability was observed for drug
products with higher market sales.

The experience with respect to Medicaid formularies varied dramatically 
across the nine states. As noted previously, California had the most restrictive 
formulary of these states, as only about one third of the FDA approved new 
drugs gained acceptance onto the Medicaid formulary. These acceptances had 
a lag of roughly four years from the date of first marketing approval. By 
contrast, New York, which had a closed Medicaid formulary only since 1977, 
exhibited an acceptance rate of over 80%, with an average time delay of only 
eight months. 

After constructing a picture of California’s and New York’s Medicaid drug
adoption process from interviews and literature, we find that the difference
between restrictive and liberal policies appears less distinct. Although
California’s Medi-Cal formulary does not offer as many new drug products
during the first four years of market life as New York’s, California’s prior
authorization program enables some access to nonformulary drug products.
However, it is not clear whether California’s 70% approval of prior
authorization requests significantly improves Medicaid beneficiaries’ access
to new drug products or merely creates further obstacles to an already
restrictive drug formulary.


New Legislation


Congress’s concern over the cost and accessibility of pharmaceuticals to 
Medicaid programs led to sweeping legislative reform in 1990. This legisla- 
tion, the Prudent Purchasing Act of 1990 [Public Law (PL) 101-508] contains 
several parts. Beginning January 1, 1991, federal Medicaid funds may be 
withheld from states for prescription drugs if the drug manufacturer of a 
Medicaid reimbursed drug product does not enter into an agreement to 
provide specified quarterly rebates to states. In response to this mandate, the 
federal government has offered to assist states with start-up administrative 
costs by providing 75 cents, rather than the usual 50 cents, of each dollar 
spent on the program in fiscal 1991. 

In addition to the rebate requirement, the federal legislation specifies a list 
of drugs that a state may exclude from coverage, such as drugs that treat 
anorexia or weight gain, fertility, smoking cessation, and coughs and colds. 
The Health and Human Services Secretary is required to update the list 
periodically to add or delete drugs determined to be clinically abusive or 
inappropriately utilized. 

The federal rebates are applied to single source and multiple source drugs.
For single source drugs, the rebate is the greater of two amounts: 12.5% of
the average manufacturer price (AMP), or the difference between the AMP and
the “best price,” the lowest price at which the manufacturer sells to any
other customer. However, this legislation exempts contract sales to the
Department of Veterans Affairs from the best price determination. Rebates for
multiple source drugs, better known as generics and over-the-counter drugs,
are 10% of the AMP for 1991-1993 and 11% thereafter. To facilitate the best
price agreement, manufacturers, wholesalers, and direct sellers of drugs must
provide pricing information, which is treated as confidential. If this
legislative requirement is not fulfilled, violators are subject to fines of
up to $100,000.

States receive manufacturer rebates by submitting, no later than 60 days 
after the end of each calendar quarter, information on the total number of 
dosage units of each covered outpatient drug dispensed under the Medicaid 
discount plan during the quarter. States collect dosage information from 
pharmacies that submit Medicaid drug reimbursement claims to the state. 
Under these operating procedures, pharmacies are not directly affected, be- 
cause reimbursement rates remain unchanged between the state and pharmac- 
ies. After submitting the rebate claims to the manufacturers, the state receives 
the appropriate rebate from the manufacturer. Thus, the states benefit from 
the rebates, while maintaining business as usual with the pharmacies. 
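
The rebate rules and the quarterly claim described above can be sketched as
follows. This is an illustration only: the function names are hypothetical,
the rebate is assumed to apply per dosage unit, and the sketch ignores the
Veterans Affairs exemption and any other statutory detail not stated in the
text.

```python
def single_source_rebate(amp: float, best_price: float) -> float:
    """Per-unit rebate: the greater of 12.5% of AMP or AMP minus best price."""
    return max(0.125 * amp, amp - best_price)


def multiple_source_rebate(amp: float, year: int) -> float:
    """Per-unit rebate: 10% of AMP for 1991-1993, 11% of AMP thereafter."""
    if year < 1991:
        raise ValueError("rebates under this legislation begin in 1991")
    rate = 0.10 if year <= 1993 else 0.11
    return rate * amp


def quarterly_rebate_claim(units_dispensed: int, unit_rebate: float) -> float:
    """Total a state claims for one drug in one calendar quarter."""
    return units_dispensed * unit_rebate


# Hypothetical drug: AMP of $8.00 per unit, best price of $6.50 per unit.
unit_rebate = single_source_rebate(8.00, 6.50)  # price gap exceeds 12.5% of AMP
print(quarterly_rebate_claim(10_000, unit_rebate))
```

In this invented case the gap between AMP and best price ($1.50) exceeds
12.5% of AMP ($1.00), so the best price provision determines the rebate.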

A prior authorization mechanism is required through which states grant
specific permission for use of a nonformulary drug before reimbursement is
permitted under the new legislation. States that currently have a prior
approval program are required to respond to a drug request within 24 hours,
with the assurance that a patient has access to a 72-hour emergency supply.
Drugs newly approved by the FDA are not eligible for prior authorization
until after six months of availability and, in the case of FDA 1-A (major
therapeutic innovations) classified drugs, prior authorization requirements are
clearly restricted.

The federal legislation has also addressed issues of generic substitution, 
pharmacy reimbursement, and, most interesting, utilization review issues 
along with education. Under this new legislation, prospective and retrospec- 
tive drug review is required. Prospective drug review specifies screening for 
potential problems caused by drug interactions, contraindications, allergies, 
dosage forms, and misuse factors. Retrospective drug review focuses on 
analyzing claims data and other records for fraud, overutilization, and in- 
appropriate use by physicians, pharmacists, and patients. Acquired informa- 
tion from these two utilization programs will assist the states in required 
outreach to educate physicians with the goal of improving prescribing and 
dispensing practices. 

It is interesting to examine policy changes in California, which has had the
most restrictive formulary in the country (2, 8; Grabowski et al 1991,
unpublished). In July 1990, new California legislation mandated a drug
discount program to be implemented immediately by California’s Medi-Cal
pharmaceutical program. Close behind California’s drug discount legislation,
Congress passed a similar drug discounting policy applicable to all Medicaid
drug formulary programs.

California’s legislation differs from the federal best price criterion, because 
it does not specify a target percent discount, but rather allows for flexibility in 
negotiations. The California legislation states that best price means the negoti- 
ated price, or the manufacturer’s lowest price available to any class of trade 
organization or entity. In addition, California does not exempt the Department 
of Veterans Affairs from the contracted best price as specified in the federal 
legislation. 

The anticipated benefits of the new legislation are not only to decrease 
Medicaid program costs, but to provide the possibility of increased Medicaid 
beneficiaries’ access to new drug products. After preliminary evaluation of 
California’s drug discount legislation, it appears that the original Medi-Cal 
drug formulary list has significantly expanded in a shorter period of time than 
previously witnessed. Additionally, the prior authorization program has bene- 
fited from an increase in state funds to improve service response time to 
providers. This improvement is difficult to evaluate so soon after phase-in of 
this legislation, but is viewed as a positive step in alleviating the burden of the 
prior authorization process to providers. 

Although decreased program costs will produce a windfall for state Medicaid
programs, analysts predict that pharmaceutical prices may increase for
other payors. Specifically, a shift in prices may adversely affect private pay
patients, including Medicare beneficiaries who do not qualify for Medicaid, if
pharmaceutical companies attempt to recover lost revenue through price
discrimination. However, this scenario may not be realistic if pharmaceutical
companies are pricing correctly in the market. Thus, the issue of drug pricing
may come under closer scrutiny. Additionally, pharmaceutical companies
may face unintended risks of Medicaid discounting, as other public and
private customers may increase pressure for similar discounts.

With legislation so new, it is impossible to evaluate the impact of PL 
101-508 on either Medicaid drug program costs or prescribing and dispensing 
patterns. Also, there is little information on whether this new legislation will 
reduce the wide variation across states in Medicaid program access to new 
drug products. 

In summary, the literature on restrictive Medicaid formularies and their
effects on Medicaid expenditures is inconclusive. Although many studies
conclude that Medicaid expenditures may actually increase in reaction to
formulary restrictions, this evidence is tentative and statistically weak.
However, the conclusion that restrictive formularies delay access of new drug
products to Medicaid beneficiaries is supported in the literature. Also, this
delayed access varies dramatically across restrictive Medicaid formulary
states, which seems to reflect each state’s administration of its
formulary. Future issues regarding Medicaid formularies point to new
cost-containment developments, such as the new Medicaid discount legislation.
As a result of the pressure on federal and state governments to control high
health care costs, more cost-containment measures, such as this legislation,
will be forthcoming. Understanding the impact of these changes on cost and
access is important for future health policy legislation and program
implementors.


Symposium on Selected Clinical Syndromes
Associated with Aging


Introduction, Gilbert S. Omenn, Symposium Editor 


The aging of society in the developed countries is the dominant demographic 
phenomenon of our time. The number of persons aged 65 and over in the 
United States has grown from 3.1 million in 1900 (4% of the population) to 
over 30 million today (12%). Projections indicate that there will be 35 million 
in the year 2000, 39 million in 2010, 52 million in 2020, and 66 million (22% 
of the population) in 2030 as the post-World War II baby-boom population 
ages. The percentage growth in the over-85 population is even faster. Thus,
there is understandable concern among health economists, budgeteers, and
planners, and among physicians as well, that the high utilization of medical
care by an aging population will reinforce the runaway health care
inflation of recent years.

This section of the Annual Review of Public Health reflects a strongly held 
view of our Editorial Committee and a growing perception nationally that 
prevention has a major place in the care of older women and men. We have 
chosen several major clinical problems of older people—acute confusional 
states, cognitive impairment, physical inactivity, falls, and nonfall injuries— 
to explore the multifactorial causes and the evidence for effective preventive 
measures. The authors provide authoritative reviews of the issues and evi- 
dence. 

Research on health promotion and disease prevention in older men and 
women was long neglected. In retrospect, it is striking that the major car- 
diovascular prevention trials reported in the 1970s and 1980s enrolled only 
middle-aged men—in MRFIT, men aged 35-57 and in the Coronary Primary 
Prevention Trial of the Lipid Research Centers, men aged 35-59. Since the 
establishment of the National Institute on Aging, the National Institutes of 
Health has been rectifying this deficiency. 

Throughout the Public Health Service, attention to older people in health
promotion/disease prevention programs has been growing. The 1989 Guide to
Clinical Preventive Services included a specific schedule of screening,
counseling, and immunization services recommended for older adults. The
highest priority is to improve functional status and quality of life, rather than
just extending life. The 1979 report of the Surgeon General, Healthy People,
and Health Objectives for the Nation 1990 the following year set as the
primary goal for older adults a 25% reduction in the number of days of
restricted activity. In Healthy People 2000, the first overarching goal is to
“increase the span of healthy life for Americans.” The 1980 figures were 73.7
years for life expectancy at birth, of which 11.7 years were dysfunctional, on
average.

There is ample time for effective health promotion actions to take effect in
older people; for example, women and men who reach age 65 can expect to
live into their eighties. But, there is no assurance that the same risk factors
have the same relative importance in older people as in middle age. Specific
research is required. What is known for several key risk factors and
preventive interventions is well summarized in the February 1992 Clinics in
Geriatric Medicine, which was designed to be complementary to this
Symposium. As with younger age groups, however, the biggest challenge is to
entice those who most need to participate: sedentary, smoking, low-income,
socially isolated, sensory-impaired, multiply medicated, or depressed
individuals. As always, those of us in public health and preventive medicine
must emphasize ways to reach the whole population and especially those most
at risk.


Preventive actions, in both clinical and community arenas, seem to face far 
more scrutiny of their likely costs than do diagnostic tests and treatments. 
Total costs of population-based health promotion programs can become rather 
substantial if screening tests, confirmatory tests, and counseling are provided 
to large numbers of people to prevent relatively few adverse events per year. 


The benefit-to-risk ratio drops further when future benefits are discounted 
against present costs, if high-risk individuals are disproportionately missed, if 
identified individuals fail to follow through with recommended behavior 
changes or medicines, or if the target population is heterogeneous for con- 
ditions screened and for appropriateness of interventions. 

As I have noted in the Summer 1990 issue of Health Affairs devoted to 
prevention, generalized analyses of health promotion and disease prevention 
programs can be misleading. It is essential to specify the target populations by 
age, sex, racial and ethnic group, underlying incidence of predisposing 
preventable risk factors, portion of the risk attributable to each of those factors 
and their combinations, willingness to participate, and compliance with rec- 
ommendations. 

The classic differentiation among primary, secondary, and tertiary
prevention efforts is useful, too. Primary prevention aims to avert the
initiation of the disease process; secondary, to detect early signs of
disease before the person is clinically affected; and tertiary, to prevent
serious and often costly complications of already-diagnosed disease. Some of
these considerations make the benefit/cost ratio of efficacious preventive
services potentially more favorable among older men and women than in their
middle-aged counterparts.

Primary prevention may seem less useful in older adults if it must precede 
the onset of the disease process by many years. However, smoking cessation 
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seems to reduce coronary mortality promptly and with similar magnitude in 
older and middle-aged individuals; the aggregate benefit per 1000 persons is 
much higher in older adults because the mortality rate is so much higher. 
Because incidence, morbidity, and mortality rates of the common cancers rise 
sharply with age, any secondary prevention program that works in older 
adults should generate more lives saved per 1000 people screened. 
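The arithmetic behind this point can be sketched briefly. The following is a minimal illustration rather than an analysis drawn from the literature: the baseline mortality rates and the relative reduction are hypothetical placeholders, chosen only to show that an equal proportional benefit averts more deaths per 1000 persons when baseline mortality is higher.

```python
def deaths_averted_per_1000(baseline_deaths_per_1000, relative_reduction):
    """Deaths averted per 1000 persons for a given relative risk reduction."""
    if not 0 <= relative_reduction <= 1:
        raise ValueError("relative_reduction must lie between 0 and 1")
    return baseline_deaths_per_1000 * relative_reduction


# Hypothetical inputs, for illustration only (not taken from any study).
MIDDLE_AGED_BASELINE = 10   # deaths per 1000 persons per year
OLDER_BASELINE = 40         # deaths per 1000 persons per year
RELATIVE_REDUCTION = 0.25   # same proportional benefit assumed in both groups

middle_aged = deaths_averted_per_1000(MIDDLE_AGED_BASELINE, RELATIVE_REDUCTION)
older = deaths_averted_per_1000(OLDER_BASELINE, RELATIVE_REDUCTION)
print(f"middle-aged: {middle_aged} averted per 1000; older: {older} averted per 1000")
```

With these assumed inputs the absolute benefit scales directly with the baseline rate, which is the sense in which the aggregate benefit is larger in older adults.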

Two major efforts are under way to assess cost-effectiveness of various 
preventive interventions for Medicare-eligible elderly persons. The first is a 
series of literature reviews and analyses by the Congressional Office of 
Technology Assessment. The second is a set of demonstrations in North 
Carolina, southern California, Baltimore, Pittsburgh, and Seattle to assess the 
costs, potential cost savings, and changes in health-related quality of life after 
introduction of experimental packages of preventive services. 

It is ridiculous to expect health promotion and disease prevention to 
accomplish grand-scale cost containment in the health care sector in the face 
of continuing escalation of expenditures for diagnosis, treatment, and long- 
term care. However, it is reasonable to expect well-selected health promotion 
and disease prevention initiatives to achieve improvement in health status, 
maintenance of functional independence, and moderation of increases in 
health care expenditures.


INTRODUCTION


Cognitive impairment in the elderly represents a major public health problem 
as the proportion of elderly persons in the population increases. Epidemiolog-
ic data show that 1.4% of persons aged 65-74 years have a dementing illness;
this figure rises to 20.8% for persons aged 85-89 years (45, 91).

Acute confusional state (ACS), or delirium, which is a form of dementia, 
differs from Alzheimer’s disease by its speed of onset. Also, the delirium 
patient may fluctuate between full alertness and coma, and the condition is 
usually reversible (21). Lipowski (56) has defined delirium as “a transient 
organic mental syndrome of acute onset, characterized by global impairment 
of cognitive functions, a reduced level of consciousness, attentional 
abnormalities, increased or decreased psychomotor activity, and a disordered 
sleep-wake cycle.” Acute confusion is a common occurrence among in- 
stitutionalized elderly persons (72, 92). 

Confusional states presumably reflect disturbances in cerebral metabolism.
In elderly persons, confusional states can be caused by many physical or
psychological alterations, including cerebrovascular accidents, seizures, in- 
fections, hypoxia, myocardial infarction, depression, and drugs (55, 57). 

Acute confusional state is a clinical entity that includes specific Diagnostic 
and Statistical Manual of Mental Disorders III criteria for the diagnosis (21). 
In this review, we expand the definition of delirium to all forms of cognitive 
impairment that result from multiple drug therapy, which is an important 
etiologic factor of cognitive impairment in the elderly (67). 

Larson and associates (52) found drug-induced dementia to be a cause of 
cognitive impairment in 11.6% of patients with suspected dementia. Further- 
more, the relative odds of drug-induced dementia increased from 1.0 in 
patients taking 0-1 drug to 9.3 in those taking 4-5 drugs, which suggests that 
polypharmacy is an important risk factor for cognitive impairment in the older
person. 

We can define polypharmacy as the use of more than one drug to treat 
symptoms or disease. There are many examples of beneficial drug com- 
binations (15). Antimicrobial combinations have been used to broaden the 
spectrum of antimicrobial coverage and decrease toxicity by use of lower drug 
concentrations (13). Similarly, levodopa has been combined with carbidopa 
to increase the concentration of levodopa in the brain (9). 

The term polypharmacy has also been used to describe the situation in 
which drugs, usually in large numbers, are used to manage multiple symp- 
toms and diseases in an individual patient. We use this definition in our 
discussion. Polypharmacy is a particularly common problem among older 
persons (12, 81). In this review, we describe the problems of ACS that result 
from specific drugs, examine the evidence that points to confusional states 
with polypharmacy, and outline methods that can be employed to prevent the 
occurrence of the problem. 


EPIDEMIOLOGY 


Although ACS in the geriatric patient is recognized as a common problem by 
health care providers, epidemiologic data on the incidence of this syndrome, 
particularly in ambulatory elderly, are lacking. Most information concerning 
the frequency of this condition has been obtained from studies of in- 
stitutionalized elderly persons (92). Incidence estimates vary greatly because 
of different diagnostic criteria and methods employed by investigators and 
because of the different settings in which patients were studied (58). 
Studies of ACS among hospital admissions have found that 10-40% of 
elderly patients were acutely confused on admission. Estimates depend on the 
type of institution (psychiatric hospital, general medical ward, or geriatric 
unit) to which patients were admitted (58). Estimates of acute confusional 
disorder in elderly patients admitted to general medical wards have usually 
been much lower. According to Hodkinson (40), about one fourth of geriatric
patients in a multicenter study of 21 geriatric departments were acutely 
confused on admission. Bergmann & Eastham (7) found that 15% of geriatric
patients admitted to a general medical ward suffered from ACS. Seymour et al 
(75) found that 16% of patients over age 70 who were admitted as emergen- 
cies to a general medical unit were acutely confused. 

In about 25-35% of geriatric patients with normal cognitive states on 
admission, ACS develops during the first month of their hospital stay (40). 
The elderly are at much greater risk of ACS than younger age groups. In one 
prospective study, 29.5% of patients over age 70 exhibited confusion within 
six weeks of their admission to general medical wards, compared with 3.6% 
of those under age 70 (27). Warshaw et al (89) found confusion and advanced 
age to be significantly correlated: 61% of female patients over age 84 were 
moderately to severely disoriented. Unfortunately, no information could be 
found on the incidence of ACS in the ambulatory elderly population. One 
would expect that most elderly persons in the ambulatory setting would be 
immediately brought to the attention of a physician if this condition developed 
and would be hospitalized. 


DRUG-RELATED ACUTE CONFUSIONAL STATES 


Numerous causes of ACS in the elderly have been identified. Most authorities 
believe that the condition results from multifactorial etiologies, rather than a 
single cause. The combination of these factors alters cerebral function and 
produces the syndrome (54). Because most clinical conditions thought to be 
associated with ACS, including cardiovascular, neurologic, and psychologic 
conditions, are commonly treated with drug therapy, it is difficult to separate 
the contribution of drugs to ACS. 

Larson et al (53) studied 200 consecutive patients over age 60 with sus- 
pected dementia and found 69.5% had dementia of the Alzheimer type. Drug 
toxicity was the most common treatable form of the suspected dementia and 
was present in ten (5.0%) of the patients. 

For physicians to attribute ACS to a drug or combination of drugs, they 
must have an index of suspicion that the drug could be responsible. Morrison 
& Katz (67) have concluded that mechanisms do not currently exist to 
evaluate appropriately the potential for specific drugs to cause cognitive 
deterioration. 


POLYPHARMACY AND THE ELDERLY 


Older adults in the US represent about 12% of the population, yet they 
consume 31% of all prescription medications and use a disproportionate share 
of nonprescription drugs (3, 24). Studies of drug use in patients who attended 
a general medical clinic associated with a teaching hospital revealed that the
average number of medications used per person was 3.2, and the number 
increased to 5.0 for patients over age 65 (81). Cross-sectional studies of 
elderly populations (conducted in 1982-1985) have shown that the elderly use 
an average of 1.7 to 2.7 prescribed medications in addition to one nonprescribed
drug, and the number of drugs used increases with increasing age (36, 38).
In 1977, patients in the US received 4.3 new and refill prescriptions, but that 
number rose to 10.7 for patients over age 65 (47). 

Many clinicians and researchers have questioned the wisdom and rationale 
of polypharmacy in the elderly (5, 49, 51, 61, 70). The use of multiple 
concurrent medications may predispose the older person to iatrogenic illness, 
including adverse drug reactions, drug-drug interactions, and decreased 
medication compliance (41) (Figure 1). Although most experts discourage the 
use of multiple medications to treat symptoms and disease in the elderly, 
investigations have shown that the number of medications used by ambulatory 
elderly persons has increased from 1978-1979 to 1987-1988 (83). Further- 
more, numerous factors will increase the likelihood of polypharmacy in the 
elderly in the future (80). New advances in diagnostic techniques will increase 
the number of detectable diseases in the elderly, and physicians may feel 
impelled to treat those conditions. New drugs will become available through 
traditional pharmacologic research and new biotechnological breakthroughs. 


Considering the prevalence of potentially treatable disease in the elderly, 
polypharmacy will become the rule, not the exception. 


COGNITIVE IMPAIRMENT RESULTING FROM 
MONOTHERAPY 


Ample evidence is available to show that certain drugs can impair cognitive 
function in specific situations (93). By using scientific methods, researchers 
have shown that anticholinergic agents and psychotropic drugs cause cogni- 
tive impairment (67). For many other classes of drugs, including anti- 
hypertensives, nonsteroidal anti-inflammatory drugs, and corticosteroids, 
documentation of cognitive impairment is less convincing (67). The majority 
of evidence relating cognitive impairment or ACS to drugs has been derived 
from individual case reports (67). 

Anticholinergic agents have repeatedly been demonstrated to alter memory 
of normal volunteers (14). By using a double-blind crossover study, Sunder- 
land and colleagues (84) have shown that patients with Alzheimer’s disease 
are more sensitive to cognitive impairment from scopolamine than normal, 
age-matched controls. The effect of low-dose intramuscular scopolamine on 
cognitive function in elderly medical inpatients has been investigated by 
Miller and associates (66). Patients were randomly assigned to receive
intramuscularly 0.005 mg/kg of either scopolamine or placebo two hours
before surgery. Scopolamine produced mild cognitive impairment, which was
observed on a Delirium Symptom Checklist and the Rey Auditory-Verbal
Learning instruments.


POLYPHARMACY AND CONFUSION 419


Figure 1 The relationship of rate of adverse drug reactions to number of drugs administered,
mortality rate, and duration of hospitalization. (Reprinted with permission from Ref. 76.)
[Three panels, each plotted against number of drugs administered: per cent of patients with
adverse drug reactions; mortality rate; average hospital stay (days).]

Trihexyphenidyl causes memory impairment in normal elderly volunteers. 
In one study, volunteers received a four-day course of treatment with either 
trihexyphenidyl 4 mg bid or amantadine 100 mg bid. Subjects who received
trihexyphenidyl complained of confusion and exhibited memory impairment 
on objective memory tests, but no memory impairment was noted when 
amantadine was administered (64). 

Because most tricyclic antidepressants have anticholinergic properties, 
their effect on cognitive state has been investigated (62). Cole and associates 
(19) reported that confusion or agitation developed in about 5% of elderly 
depressed patients receiving amitriptyline or imipramine. Amitriptyline signi- 
ficantly impaired cognitive performance in a placebo controlled study of 
elderly depressed outpatients with pretreatment evidence of mild cognitive 
impairment (11). Not all investigators have noted clinically significant confu- 
sion that resulted from the use of anticholinergics in the elderly. Seifert and 
colleagues (74) studied the use of drugs with anticholinergic effects in 29 
confused and 54 nonconfused elderly nursing home patients. No patient 
received higher than the equivalent recommended daily dose of atropine, 
when calculated in terms of relative anticholinergic potency. No statistically 
significant correlation was found between confusion and the amount of 
anticholinergic administered. The authors noted that medications with signifi-
cant anticholinergic effects were often prescribed for patients who already had
confusion and cognitive impairment. 

Cognitive impairment resulting from benzodiazepine administration has 
been extensively investigated by using well-designed methodology. Impaired
learning of verbal and visual information was demonstrated in both anxious 
and normal nonelderly volunteers (26, 73). Bond & Lader (10) have shown 
that the cognitive impairment associated with long-acting benzodiazepines 
persists for an extended period of time. 

The effect of chronic administration of benzodiazepines on cognitive func- 
tion in the elderly has received less study. Larson et al (52) found that in 35 of 
308 patients with suspected dementia, the condition resulted from chronic 
drug use. Dementia was attributed to the chronic administration of a single 
benzodiazepine in 13 patients in this group. However, the authors also found 
that the risk of dementia increased with the number of concurrent drugs 
administered. Other investigators have failed to find cognitive impairment 
with chronic benzodiazepine administration. Viukari and associates (88) per- 
formed a randomized, double-blind crossover study of the effects of 1 mg 
flunitrazepam and 5 mg nitrazepam on cognitive tests in hospitalized elderly 
psychiatric patients. They found little effect on performance. 

High blood pressure is a common disease in elderly persons, occurring in 
about 40% of the population over age 65. High blood pressure is a risk factor 
for cognitive impairment, primarily as a result of multi-infarct dementia, but 
antihypertensive drugs have also been implicated as a cause of cognitive 
dysfunction (1). In that case report, five patients complained of such symp-
toms as decreased memory, inability to perform calculations, and reading
impairment. Solomon and colleagues (77) conducted a controlled study to
demonstrate cognitive impairment in patients treated with methyldopa and 
propranolol. Croog and associates (33) also provided evidence that methyldo- 
pa could cause decreased visual-motor performance and lead to decreased 
quality of life. 

Hypotension, particularly the iatrogenically induced syndrome, frequently has
been cited as a cause of ACS in the elderly (28). Aggressive reduction of 
blood pressure in the elderly may cause decreased perfusion to the central 
nervous system and precipitate adverse effects on cognitive and behavioral 
function. Goldstein and colleagues (28) evaluated this phenomenon by per- 
forming a battery of psychometric tests on hypertensive patients over age 60. 
A group of patients was tested after successful blood pressure reduction with
antihypertensive drugs and then compared with a placebo-treated control 
group. There was no difference in cognitive function, motor skills, memory, 
or affect between the two groups. Neither blood pressure reduction nor 
medication impaired cognitive function in this elderly population. 

Digoxin is one of the most frequently used medications by persons over age 
65 (36, 38). Closson (16) and Grubb (34) have shown that digoxin intoxica- 
tion can cause psychiatric side effects, including delirium and dementia. 
Tucker & Ng (85) demonstrated a significant correlation between plasma 
digoxin concentration and performance on the Buschke selective reminding 
test and the facial recognition test. There have been numerous case reports of 
cognitive impairment attributed to many other drug classes, including nonsteroidal
anti-inflammatory agents, meperidine, corticosteroids, and H-2 receptor an- 
tagonists (18, 22, 23, 29, 87). 

The studies cited above demonstrate that anticholinergic drugs, as well as 
several other drug groups, can produce cognitive impairment. Most effects of 
drug-induced cognitive impairment have been minor and only detectable with 
sensitive testing instruments. Several investigators have been unable to 
demonstrate gross cognitive impairment when using such screening tests as 
the Mini-Mental State Examination (4, 25, 82). 

Administration of drugs in most of the above-mentioned studies did not 
produce an actual acute confusional state. Other factors, including disease, 
nutrition, psychological stressors, and drug interactions probably play an 
important role in the acute confusional state of elderly persons. 


COGNITIVE IMPAIRMENT RESULTING FROM 
POLYPHARMACY 


In the cases described above, acute cognitive impairment was usually attrib-
uted to a single drug. Because the elderly usually take several drugs
concurrently, cognitive impairment could result from the combined effect of
drugs.





422 STEWART & HALE 


Very few reports of ACS caused by polypharmacy are published in national 
refereed journals. Several examples of ACS resulting from polypharmacy 
have appeared in practice-oriented journals. Bressler (12) described a 75- 
year-old man who was drug-free until atrial fibrillation developed. The patient 
then received digoxin, warfarin, hydrochlorothiazide, and diazepam. Two 
weeks later, he was rehospitalized with complaints of visual disturbance, 
diffuse headache, lethargy, fatigue, impaired memory, and daytime somno- 
lence. On admission to the hospital, he had elevated concentrations of serum 
digoxin (2.8 ng/ml) and diazepam (2.4 mcg/ml; normal is 0.5-1.0 mcg/ml)
and the prothrombin time was 28 seconds (normal 10-14). His lethargy, 
somnolence, and memory deficit were attributed to diazepam overdose, even 
though the medications were prescribed at usual adult doses. This case 
illustrates the pharmacokinetic and pharmacodynamic changes that place 
older persons at risk of adverse drug reactions. 

Gordon & Preiksaitis (30) described three elderly patients who suffered 
from acute confusion as a result of multiple medications. One patient, an 
80-year-old man, was brought to the geriatric care center for evaluation of a 
seven-month history of increasing confusion. His prescribed medications 
included carbidopa/levodopa (25/100 mg tid), enteric coated aspirin (650 mg 
bid), diltiazem (60 mg qid), allopurinol (300 mg daily), hydrochlorothiazide/ 
amiloride (every other day), digoxin (0.125 mg daily), nitropaste (qid), 
ranitidine (150 mg at bedtime), and diazepam (5 mg bid, as needed). Di- 
azepam, hydrochlorothiazide/amiloride, and allopurinol were discontinued, 
and dosage reduction was accomplished for aspirin, diltiazem, and digoxin. 
Within two weeks, there was a marked improvement in mental and physical 
functioning. The authors concluded that the patient suffered from 
polypharmacy with psychoactive and nonpsychoactive drugs. 

Other authors have described similar situations of multiple drug therapy 
that caused acute agitation and confusional states in older persons (65, 69, 
71). These authors have called for a more cautious approach to multiple drug 
use in the elderly. 

We have studied factors that correlate with cognitive decline in 1264 
elderly participants who attended a health screening program (82). Mini- 
Mental State Examination scores were used to identify risk factors for cogni- 
tive decline. Age, self-reported memory loss, and the presence of multiple 
disease states were the most important predictors of cognitive decline. The 
total number of drugs used was not an important predictor of cognitive 
function. Only one drug, dipyridamole, was associated with decreased scores 
on the Mini-Mental State Examination. 

Magaziner and colleagues (60) studied medication use and functional 
decline in 609 women, aged 65 or older, in 20 contiguous census tracts of 
Baltimore, Maryland. After controlling for age, education, physical health, 
number of chronic conditions, and baseline functional status, prescription
medication use was associated with a decline in ability to perform physical 
activities of daily living and an increase in symptoms of depression over a 
one-year period; however, no change was noted in cognitive functioning. The 
average number of medications reported was 3.45, and 10% of the women 
reported the use of five or more prescription medications. 


POLYPHARMACY AS A CAUSE OF ACS: BARRIERS OF 
ATTRIBUTION AND CAUSALITY 


Numerous factors have contributed to our lack of knowledge concerning the 
role of polypharmacy in producing ACS in the elderly. The most important 
factors are our present paradigm of adverse drug reaction reporting, lack of 
drug testing for cognitive impairment, and the nature of ACS. Current models 
for identifying and reporting drug-induced illness have discouraged reports of 
ACS caused by combination drug therapy. Most information on drug-induced 
disease has been derived from spontaneous reporting via the Food and Drug 
Administration’s (FDA) voluntary reporting program and from published 
literature reports in the letters and brief communication sections of medical 
and pharmaceutical journals (43). Many investigators have proposed models 
to assist clinicians in detecting adverse drug reactions and assessing causality 
or attribution to a drug or drugs (8, 42, 43, 46, 50, 68). These models and the 
FDA reporting system (1639 reporting form) focus on the identification of a 
specific drug, or sometimes a drug interaction, as the causative agent that 
produces an adverse effect. To be identified as the etiologic agent, a drug 
should usually be known to cause the reaction and its administration should be 
temporally related to the adverse event. Ideally, the reaction should resolve on 
discontinuance of the drug (dechallenge) and recur when the drug is adminis- 
tered (rechallenge) to the patient (43, 50). 
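The four criteria just listed can be written as a simple checklist. The sketch below is only an illustration of the logic in this paragraph: it is not one of the published causality models cited above, and the field names and the unweighted count are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SuspectedReaction:
    """Evidence about one drug and one adverse event (field names are illustrative)."""
    known_to_cause_reaction: bool
    temporally_related: bool
    resolved_on_dechallenge: Optional[bool]  # None = dechallenge not attempted
    recurred_on_rechallenge: Optional[bool]  # None = rechallenge not attempted


def criteria_met(case: SuspectedReaction) -> int:
    """Count how many of the four criteria are documented as satisfied."""
    checks = [
        case.known_to_cause_reaction,
        case.temporally_related,
        case.resolved_on_dechallenge is True,
        case.recurred_on_rechallenge is True,
    ]
    return sum(checks)


# A single suspect drug that was stopped and later restarted:
# all four criteria can be documented.
single_drug = SuspectedReaction(True, True, True, True)

# One drug within a large regimen: no drug-specific dechallenge or
# rechallenge was carried out, so those two criteria stay undocumented.
one_of_many = SuspectedReaction(True, True, None, None)

print(criteria_met(single_drug), criteria_met(one_of_many))
```

Treating an unattempted dechallenge or rechallenge as "not satisfied" is a deliberate simplification; it mirrors the way single-drug criteria leave multi-drug cases without a clear attribution.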

In the setting of an acute confusional state caused by the combination of ten 
different drugs, those criteria will not be fulfilled. Clinicians are unlikely
to suspect multiple drugs in combination as a cause of ACS. In
most instances, physicians will not discontinue (dechallenge) all drugs, and if
the patient's confusional state improved after dechallenge, a rechallenge
would rarely be attempted. Most journal editors would lack interest in articles 
that describe ACS when the causative agent cannot be more clearly delineated 
(43, 44). As a result, there is a paucity of published reports describing ACS 
caused by polypharmacy. Another factor contributing to our lack of knowl-
edge in this area is that little research has been conducted to evaluate the
effect of drug therapy on cognitive impairment in the elderly. For many years, 
geriatric specialists have called for the FDA to establish requirements for drug 
testing in the elderly (79), as the pharmacokinetics and pharmacodynamics of 
many drugs differ in older persons because of alterations in elimination and
distribution. 

Because multiple drug therapy is commonplace, new drugs should be tested 
to determine their propensity to interact with drugs commonly prescribed for 
older persons. Unfortunately, the FDA took nearly a decade to establish 
guidelines for drug testing in the elderly (79, 86). The new regulations should 
lead to safer, more effective drug therapy, but the guidelines still do not 
provide for evaluation of drug-induced cognitive impairment. Existing mech- 
anisms do not ensure an appropriate evaluation of the potential for specific 
drugs, let alone drug combinations, to cause cognitive deterioration. 

The nature of ACS makes it very difficult to attribute the condition to drugs,
especially to combination drug therapy. The clinical setting of an elderly
patient with ACS is likely to be quite unstable. A patient who has eight to ten 
drugs prescribed by a physician probably has several acute or chronic illnesses 
that could each be a cause of ACS. The patient is probably anxious and 
concerned about issues of hospitalization, institutionalization, separation 
from spouse or family members, and death. Therefore, the patient likely has a 
drug or drug combination, a medical condition, and a psychologic state that 
could precipitate ACS. In this setting, nutrition and hydration usually are 
compromised as well. An ambulatory patient in this difficult state is also less 
able to follow medication administration instructions correctly, and medica- 
tion errors may further complicate the situation. 

When an ACS is recognized, several factors typically change rapidly. A 
patient with ACS quickly gains the attention of physicians and is admitted to a 
hospital. Fluid and electrolyte balance are assessed, and treatment is im- 
mediately started. Numerous drugs in the patient’s regimen may be discontin- 
ued, while new drugs may be administered; health care workers ensure that 
drugs and dosages are administered correctly. The patient immediately re- 
ceives increased attention, which may provide comfort and allay anxiety. 
Nutritional status probably improves, and better care is provided for the 
patient’s acute and chronic illnesses. In 24-72 hours, the patient’s ACS may 
improve or completely resolve. One then must determine whether the 
patient’s improvement resulted from increased attention, changes in hydration 
and electrolyte balance, drug therapy, nutrition, or a combination of these 
factors. The settings of ACS clearly make attribution to drugs a difficult task.


POLYPHARMACY: AN UNCONTROLLED EXPERIMENT 


It is impossible to predict the outcome of combining eight to ten different 
medications in an elderly patient. A scientist working in the laboratory would 
never add ten different chemicals at random to a test tube without first 
preparing for possible explosive consequences. Yet, ten, and sometimes
more, medications are often given almost randomly to elderly persons without 
a clear understanding of possible untoward consequences. The ease of 
rationalizing expected therapeutic benefit from prescribing multiple drugs 
contrasts with our limited ability to predict deleterious side effects (59).

The interaction of two drugs administered to a patient has been a major area 
of interest for many scientists for nearly two decades (37). Even with careful- 
ly controlled experiments, it has been difficult to identify interacting proper- 
ties when two drugs are given together. We are not aware of any attempts to 
conduct a scientifically designed study to determine the pharmacologic effects 
of combining five to ten drugs in the same patient. Loewe (59) attempted to 
apply scientific principles to the study of interactions between drugs more 
than two decades ago. However, he was never able to devise a satisfactory 
method for the simultaneous study of the effect of more than two drugs in 
combination. Therefore, one might consider it an uncontrolled experiment 
each time multiple drugs are prescribed for an older person. 

The multiple pharmacologic effects possessed by a drug make it difficult to 
predict results of combination therapy. Few drugs have precise and narrow 
ranges of action. Thiazide diuretics, for example, produce many different 
effects, including decreased body water, sodium, chloride, and potassium and 
increased serum glucose, calcium, uric acid, cholesterol, and triglycerides 
(90). Any drug that has effects similar or opposite to these actions may 
interact with thiazide diuretics, thus producing additive or antagonistic 
effects. The difficulty of accurately predicting the effects of combining ten 
drugs can be appreciated. 
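A simple count gives a sense of the scale of the problem. The sketch below assumes, purely for illustration, that every pair and every larger subset of drugs in a regimen is a candidate for interaction; it says nothing about which combinations actually interact.

```python
from math import comb


def pairwise_combinations(n_drugs: int) -> int:
    """Number of distinct two-drug pairs within a regimen of n_drugs."""
    return comb(n_drugs, 2)


def multi_drug_subsets(n_drugs: int) -> int:
    """Number of distinct subsets containing two or more drugs."""
    return 2 ** n_drugs - n_drugs - 1


# Compare a two-drug regimen with five-drug and ten-drug regimens.
for n in (2, 5, 10):
    print(n, pairwise_combinations(n), multi_drug_subsets(n))
```

A two-drug regimen contains a single pair, whereas a ten-drug regimen contains 45 pairs and 1013 subsets of two or more drugs, which helps explain why methods designed for two-drug studies do not extend easily to polypharmacy.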

There is ample evidence to demonstrate the hazards of multiple drug 
therapy. Smith et al (76) found an adverse reaction rate of 40% in patients 
given 16-20 drugs, compared with a rate of 7% in those given 6-10 drugs. 
May et al (63) studied the effects of multiple drug administration on adverse 
drug reactions in 10,518 patients who were hospitalized on a general medical 
service. They found a higher risk of adverse drug reactions for patients 
who received multiple drugs and speculated that the increased risk may result 
from drug interactions. Generally, the incidence of adverse drug reactions in- 
creases with the number of drugs administered (6, 17, 20, 31, 35, 39, 42, 
48, 78). 

It seems logical that if one drug is beneficial in treating one disease, then 
several drugs could benefit an elderly person with multiple diseases. Un- 
fortunately, we have not reached the level of sophistication in pharmacology 
and therapeutics to test the above hypothesis. Until methods are developed 
to examine pharmacologic effects of multiple drug therapy, physicians 
caring for older persons must be continually vigilant for ACS from poly-
pharmacy.


FUTURE STRATEGIES TO PREVENT POLYPHARMACY
IN THE ELDERLY 


The problems identified above clearly indicate that numerous factors contrib- 
ute to polypharmacy in older persons. Currently, we do not know whether 
multiple drug therapy in the elderly is optimal or if it has a beneficial or 
detrimental effect on the quality of life. Therefore, this issue requires a 
concerted effort on the part of political leaders, educational institutions, 
governmental agencies, and the pharmaceutical industry to develop future 
strategies to deal with the concerns of polypharmacy. 


Educational Strategies 


A major effort is needed to develop educational programs about geriatric drug 
therapy for health professionals and consumers. Although the greying of the 
population has been predicted for many years, educational programs have not 
been implemented to accommodate these changes. Most physicians and other 
health care practitioners caring for the elderly have not had the benefit of 
geriatric education. Education planners, political leaders, government agen- 
cies, and private foundations should join forces to develop geriatric education 
programs that include rational use of medication in the curriculum of all 
health care providers. 

A broad system of continuing education for current practitioners is needed 
to upgrade their knowledge base on drug therapy. Simultaneously, education- 
al programs should be provided to elderly consumers concerning the risks and 
benefits of drug therapy and the techniques they can employ to make drug 
therapy safer, more effective, and less costly. 


Drug Utilization Review 


Sophisticated computer software will be needed to monitor therapy of the 
elderly prospectively and screen patients’ therapy for inappropriate drug 
combinations. Computers could alert the pharmacists and prescribing physi- 
cians of multiple drug therapy in time to take corrective action. 
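
As a rough illustration of the kind of prospective screening described above, the following Python sketch checks one patient's active medication list against a table of flagged combinations. The drug names, the `FLAGGED_PAIRS` table, and the alert threshold of five concurrent drugs are hypothetical placeholders, not clinical guidance; a real system would presumably draw on a maintained interaction database.

```python
from itertools import combinations

# Hypothetical placeholder pairs; a real system would load a curated database.
FLAGGED_PAIRS = {
    frozenset({"drug_a", "drug_b"}),
    frozenset({"drug_c", "drug_d"}),
}

# Assumed alert threshold for the number of concurrent drugs (illustrative only).
MAX_CONCURRENT_DRUGS = 5


def screen_therapy(active_drugs):
    """Return a list of alert strings for one patient's active medication list."""
    drugs = sorted(set(active_drugs))
    alerts = []
    if len(drugs) > MAX_CONCURRENT_DRUGS:
        alerts.append(f"polypharmacy: {len(drugs)} concurrent drugs")
    # Check every unordered pair of drugs against the flagged table.
    for first, second in combinations(drugs, 2):
        if frozenset({first, second}) in FLAGGED_PAIRS:
            alerts.append(f"flagged combination: {first} + {second}")
    return alerts


if __name__ == "__main__":
    for alert in screen_therapy(["drug_a", "drug_b", "drug_e"]):
        print(alert)
```

Run at the point of dispensing, a check of this kind could return its alerts to the pharmacist or prescriber before the prescription is filled.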

The Medicare Catastrophic Care Act of 1988 included a provision that 
would have required participating pharmacies to maintain pharmacy records 
for all drugs dispensed and to offer to counsel patients on appropriate use of a 
dispensed drug and on potential interactions between drugs (32). Participating 
pharmacies also had to agree to use the electronic point-of-sale claim process- 
ing system through a computer terminal in the pharmacy. Although the 
legislation has been repealed, similar legislation has been enacted for Medic- 
aid (2). 

In our opinion, computerized records of prescriptions received by patients 
would be an important step to prevent problems of polypharmacy in the 
elderly. 





Research 


Research efforts must be intensified to document the problems of multiple 
drug therapy and ACS in the elderly. Many physicians can relate personal 
experiences in which ACS developed in an elderly person as a result of 
polypharmacy, yet there is a paucity of these reports in the medical and 
pharmacy literature. Health professionals should be encouraged to report 
these instances and document the public health importance of the problem. 

Research is also needed to determine factors that predispose older persons 
to the problem of ACS and to identify the drug or drug combinations most 
likely to cause the problem. 


SUMMARY 


Cognitive impairment resulting from drug therapy in older persons has been 
well documented for numerous classes of drugs. Unfortunately, the problem 
of ACS caused by polypharmacy is rarely reported in the medical literature, 
although we believe that it occurs frequently. Health professionals need more 
education concerning the risks of drug therapy in older persons and methods 
of reducing the use of multiple drug therapy. Finally, more research is needed 
to identify patient and drug factors that lead to drug-induced ACS and 
cognitive decline in the elderly. 


INTRODUCTION 


Of all the age-related syndromes, perhaps none is more strongly associated 
with aging than dementia. Dementia is defined as a syndrome of global loss of 
cognitive function, especially memory, sufficient to impair social or occupa- 
tional function. Before the mid-1970s, dementia was considered a natural, 
indeed normal, consequence of aging. Alzheimer’s disease (AD) was primari- 
ly considered a cause of presenile dementia, whereas so-called “senile de- 
mentia” (age 65+) was largely ignored by both the public and medical 
practitioners. Now we know that AD affects adults of all ages, but only rarely 
those under age 60. The prevalence increases dramatically for each age group 
over age 65, and AD is the most common condition to cause disability in “old 
old” persons over age 85. Concomitant with the aging of most advanced 
societies, the past decade has seen increasing awareness that dementia is a 
problem of immense importance to public health (1). 


THE SYNDROME OF DEMENTIA 


As a clinical term, dementia describes a syndrome of generalized mental 
deterioration that causes functional impairment (2). The most widely accepted 
formal definition of dementia is from the third edition of the Diagnostic & 
Statistical Manual of the American Psychiatric Association (DSM-III) (3): 


431 
0163-7525/92/0501-0431$02.00 





432 LARSON, KUKULL & KATZMAN 


Dementia consists of impairment in short- and long-term memory, along with 
disturbances of other cognitive functions (e.g. abstract thinking, judgment, 
language, recognition) and/or personality change. The disturbance must be 
sufficient to interfere significantly with work, usual social activities, or 
relationships with others. Dementia is explicitly distinguished from an acute 
or short-lived episode of delirium and from focal loss of function, like isolated 
language problems (aphasia) or circumscribed memory loss (amnesia). 

As individuals age beyond 65, decline is more variable. Furthermore, 
although environment and educational achievement clearly affect cognitive 
function at all ages, these effects are probably more important in older age 
groups. Because most dementing illnesses begin insidiously, the distinction 
between so-called “normal” age-related functional decline and early or mild 
dementia can be difficult until more information is known about the rate of 
decline and associated functional problems (2, 4, 5, 7). One key distinguish- 
ing feature is that age-related decline does not usually cause significant 
impairment of function (7). These persons are able to compensate and function 
independently. Also, the pace of decline due to dementing illness is 
greater (6), as judged from serial lay observations of general function and 
repeated neuropsychological measures of cognitive function (8, 9), usually 
after one or at most two years of observation. 

Underdetection or underrecognition of dementia by 20% or more has been 
demonstrated repeatedly in community-based studies and in hospital settings 
(10). The extent varies in different settings and depends on where one places 
the boundary between normal and abnormal age-related decline and the 
willingness of providers to examine mental function. 

The dysfunction of dementia makes persons vulnerable in many ways (1). 
Demented persons are at risk of socioeconomic victimization and abuse. If 
they are isolated, they frequently suffer because of an inability to handle 
personal business and household affairs. They are at high risk of personal 
injuries from falls and accidents and can injure others if they operate complex 
machines, like automobiles. Finally, there are clear “medical” risks of de- 
mentia (11). Patients may not be able to take medication reliably or safely 
(12). They often cannot report problems to family or caregivers; as a result, 
medical problems are not detected until they are relatively far advanced (7). 
There are many causes of dementia; some causes, which can be effectively 
and easily treated, may lead to irreversible damage if not cared for promptly 
(11, 13). For these reasons, we conclude that underrecognition of dementia, 
especially in older persons, is a potentially serious problem. 


CAUSES OF DEMENTIA 


Like other common syndromes in medicine, dementia may be caused by 
many illnesses, which may present singly or in various combinations. It is a 
serious mistake to equate dementia with AD, even though in most countries AD 
is probably the most common cause of dementia (1, 15). 

The distribution of illnesses that cause dementia has been the subject of 
much clinical research in recent years (13, 14). We know that in the US and 
Canada, AD is the most common cause (15). However, the distribution of 
Alzheimer’s and other causes, especially multiinfarct dementia, the second 
most common cause, varies depending on the nature of the underlying 
population (13, 14). In predominantly Caucasian, middle-class, community-based 
populations, AD may be the cause of dementia in up to 90% or more of 
cases (16). By contrast, in populations with larger numbers of blacks, especially 
with a high prevalence of untreated high blood pressure or diabetes, 
the proportion of multiinfarct dementia will be relatively greater, although 
AD is usually still more common (13, 14). 

The Canadian Consensus Conference on Assessment of Dementia evalu- 
ated the overall prevalence of conditions that cause dementia (17). After AD 
and multiinfarct dementia, conditions judged to have moderate prevalence 
were Parkinson’s disease, alcohol abuse, drug toxicity, and depression 
(though not an isolated cause of dementia). A long list of other causes 
includes chronic degenerative diseases of the nervous system (Pick’s disease, 
Huntington disease, progressive supranuclear palsy), postanoxic or posttrau- 
matic brain damage, brain tumor, normal pressure hydrocephalus, certain 
metabolic disorders (hypo- or hyperthyroidism, hyponatremia, hypercalcemia, 
hypoglycemia, renal failure, B12 deficiency, and other nutritional deficiencies), 
certain central nervous system infections [human immunodeficiency 
virus (HIV), neurosyphilis, and Creutzfeldt-Jakob disease], 
neurotoxins (aluminum, mercury, and aromatic hydrocarbons), and the re- 
mote effects of carcinoma on the central nervous system (2, 17). 

Some causes of dementia are more likely to contribute to the severity of 
dementia and patient dysfunction than they are to be the sole cause of 
dementia. Drug toxicity is a good example (12). Hearing loss, depression, 
and acute or subacute infections, like urinary tract infection, do not them- 
selves cause dementia, but can make dementia worse and are called “sources 
of excess disability” (18, 19). 

The heterogeneity of dementias is important for clinicians, investigators, 
and policy makers. Clinicians need to perform careful evaluations of patients 
to determine the causes and plan appropriate treatment and follow-up (11). 

Research involving patients with AD can only begin after a careful diagnos- 
tic evaluation, which consists of a general history, physical, neurological, and 
neuropsychological evaluation, and laboratory investigation, to look for other 
causes of dementia. At present, there is no specific diagnostic test for 
Alzheimer’s (20); the clinical diagnosis is probabilistic, with optimal sensitiv- 
ity and specificity about 85-90% (21, 22). 

For health planning, it is relevant that AD has a median survival of eight to ten years after 
onset (range 1-20 years) (23). Survival of multiinfarct dementia patients is 
usually shorter, although also highly variable and likely to be improved by 
control of risk factors for strokes and use of antiplatelet drugs, like aspirin or 
ticlopidine. 


POPULATION AND AGE-SPECIFIC DISEASE RATES 


Prevalence of Alzheimer’s Disease and Dementia 


There are two current approaches to estimate the prevalence of dementia and 
AD. The first is characterized by population sampling, a screening examina- 
tion, diagnosis of the screen positives, and usually examination of a sample of 
the screen negatives to estimate the false-negative rate of the screening 
battery. The second approach identifies cases from well-documented medical 
records. 

In European countries, the prevalence of dementia was 1-6% in persons 
aged 65 years or more, either based on National Health Service reports for 
primary identification of cases, or by using a screening instrument for cogni- 
tive impairment followed by a physician’s examination to define the case 
(24-31). 

From a sample of 6634 community-dwelling persons aged 55 or greater in 
Shanghai, China, Zhang et al (32) screened 5055 for cognitive impairment by 
using the Mini-Mental State Examination (33). Different cutoff scores were 
used depending on level of education. Screening was empirically estimated to 
have a sensitivity of 85% and a specificity of 93%. All screen positives, plus a 
5% random sample of screen negatives, were subjected to a differential 
diagnosis of dementia. The prevalence of dementia was 4.6% among those 
persons aged 65 and older and 24.3% among those aged 85 and older, rates 
similar to those derived in the European studies cited above (24-31). 
Approximately 65% of demented persons were classified as having AD. 
Female sex and low educational level were strongly related to dementia. 
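
The screening characteristics reported above can be turned into the expected yield of the screening phase with Bayes' rule. The Python sketch below is only an illustration: it plugs the reported sensitivity (85%) and specificity (93%) into the standard predictive-value formulas, and it assumes, for the sake of the example, that the 4.6% prevalence applies to the screened group, which the study description above does not state.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a screening test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Sensitivity and specificity as reported; prevalence is an assumed input.
    ppv, npv = predictive_values(0.85, 0.93, 0.046)
    print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumptions fewer than half of the screen positives would have dementia, which is consistent with a design that follows screening with a full differential diagnosis and examines a sample of screen negatives.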

Pfeffer et al (34) used a similar design to estimate the prevalence of 
dementia in a retirement community. Persons scoring in the abnormal range 
on screening, plus a 14% sample of those who scored in the normal range, 
were referred for diagnosis. The screening battery included the Mini-Mental 
State Exam (33), a structured interview for demographics and past medical 
history, and an assessment of depression. Persons continuing to the next 
phase were examined by a neurologist, whose findings were reviewed blindly 
by another neurologist, followed by a consensus conference. Dementia of 
Alzheimer’s type occurred in 15.3% of the community surveyed, including 
35.8% of the 80+ age group. By excluding cases classified as questionable, 
the estimate decreased to 6.2% of those 65+ and 15.8% of those over age 80. 
Of course, the retirement community may have been enriched with cognitively 
impaired persons. 

A widely cited study (16), which used the survey/screening approach, 
began with a census of persons over age 65 in East Boston, of whom 3623 
(81%) received the initial memory testing. The sample receiving the full 
diagnostic examination for dementia included 52% (196) of 378 with poor 
memory, 9% (101) of 1108 with intermediate memory, and 8% (170) of 2137 
with good memory. Sampling after screening, rather than before, was an 
innovation from the design used by Pfeffer et al (34). Standardized criteria 
(35, 36) were applied to the dementia diagnosis, but determining whether the 
observed intellectual deficit was “sufficient to interfere with social and occu- 
pational functioning” (a DSM-III criterion) was difficult in their survey, 
because the expected social and occupational functioning was closely tied to 
the social role expectations. Instead of measuring dysfunction, the in- 
vestigators imputed from the examination and test scores the size of a deficit 
that would be expected to affect function. An accompanying editorial (37) 
noted that this procedure may overestimate prevalence by labeling persons ill 
who appear to fulfill their roles adequately. 
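
Because the diagnostic examination was given to different fractions of each memory stratum, an overall prevalence estimate has to weight each stratum back to its screened size. The Python sketch below shows that weighting. The stratum sizes and numbers examined are those reported above; the numbers of diagnosed cases per stratum are hypothetical placeholders, since they are not given here, and the function name is ours for illustration.

```python
# (stratum, persons screened, persons examined) as reported in the text.
STRATA = [
    ("poor memory", 378, 196),
    ("intermediate memory", 1108, 101),
    ("good memory", 2137, 170),
]


def weighted_prevalence(strata, cases_found):
    """Estimate overall prevalence from a stratified diagnostic subsample.

    cases_found maps stratum name -> cases diagnosed among those examined.
    """
    total = sum(size for _, size, _ in strata)
    expected_cases = 0.0
    for name, size, examined in strata:
        stratum_rate = cases_found[name] / examined  # prevalence within stratum
        expected_cases += stratum_rate * size  # scale up to the full stratum
    return expected_cases / total


if __name__ == "__main__":
    for name, size, examined in STRATA:
        print(f"{name}: sampling fraction {examined / size:.0%}")
    # Hypothetical case counts, for illustration only.
    example_cases = {"poor memory": 98, "intermediate memory": 10, "good memory": 2}
    print(f"weighted prevalence: {weighted_prevalence(STRATA, example_cases):.3f}")
```

The printed sampling fractions reproduce the 52%, 9%, and 8% quoted above; the weighted prevalence depends entirely on the placeholder case counts and carries no empirical meaning.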

The prevalence of AD in East Boston (16) was 10.3% in persons aged 65+, 
compared with 6.2% (or 15.3%) of the US retirement community (34), 
and 3.1% in Shanghai (32). For the over-85 age group in East Boston, 
the prevalence of AD was estimated to be 47%. Evans (15) estimates 
that 7.5-14.3 million persons in the US will have AD by the year 2050, of 
whom 4.7-10.5 million would be aged 85 or more, based on East Boston 
rates. 

The three studies cited above went into the community to find cases. 
Kokmen et al (38) used the database of the Mayo Clinic and affiliated 
hospitals to determine the prevalence of dementia and AD in Rochester, 
Minnesota, by using a list of 26 diagnoses that possibly lead to dementia. 
Records of all persons alive on prevalence day (January 1, 1975) were 
searched for the existence of any of the diagnoses between 1959 and 1982, 
which then led to a detailed medical record review and evaluation of the 
person’s dementia status according to established criteria (35, 36). For persons 
aged at least 65 years, the age- and sex-adjusted prevalence of dementia 
was approximately 3.5%; the prevalence of AD was 2.4% (12.6% for 
the 85+ age group). These rates are similar to those reported for Shanghai 
(32). The stability of the population base in Rochester and the depth of the 
medical database available for computerized search make this type of study 
possible. However, dementia could go unrecognized either because one of 
the potential diagnoses was not used in the medical record, or because suf- 
ficient information was not available on review to make a diagnosis based 
on the criteria. We conclude that the results in Rochester and East Boston 
probably constitute lower and upper bounds for US estimates of prevalence 
of dementia and AD. In all studies, prevalence rates increase dramatically 
with age. 


Incidence 


The incidence of “medium and severe age psychosis” and organic brain 
syndrome per 100 person-years for the population over age 60 in Lundby, 
Sweden, during 25 years of observation (1947-1972) was calculated to be 1.6 
for 1947-1957 and 1.1 for 1957-1972 (39). (For consistency, rates are 
reported here in person-years, even though some were originally estimated as 
cumulative incidence per annual population.) Incidence increased sharply 
with age: 5.8 and 3.3 per 100 person-years for the two time periods for the 
80-89 age group (39). 

Treves et al (40) conducted a study with the Israeli national neurologic 
disease register and a review of medical records of potential cases. Taking 
the date when “the first change in mental function consistent with a diagnosis of 
dementia” was noticed as the time of onset, the incidence of presenile dementia of the 
Alzheimer’s type (ages 40-60) was found to be 0.002 per 100 person-years (2.4 per 
100,000 population). Rates of disease appeared to increase with age in the 
same manner as dementia rates among older age groups. Thus, the authors 
inferred that presenile and “senile” disease may be part of the same con- 
tinuum, rather than two separate diseases (40). 
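
Rates in this section are quoted on different bases (per 100 person-years and per 100,000 population), so a small helper for computing and rescaling them can be useful. The Python sketch below is illustrative only; the function names are hypothetical, and the cohort numbers in the usage example are invented for the demonstration.

```python
def incidence_rate(new_cases, person_years, per=100):
    """Incidence rate expressed per `per` person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years * per


def rescale_rate(rate, from_per, to_per):
    """Convert a rate quoted per `from_per` units to one per `to_per` units."""
    return rate * to_per / from_per


if __name__ == "__main__":
    # 2.4 per 100,000 (as quoted in the text) expressed per 100.
    print(rescale_rate(2.4, from_per=100_000, to_per=100))
    # Hypothetical cohort: 30 new cases over 1,500 person-years of follow-up.
    print(incidence_rate(30, 1500))
```

On this scale, 2.4 per 100,000 corresponds to 0.0024 per 100, which the text rounds to 0.002.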

A longitudinal follow-up study (41) from the Bronx, New York, involved 
434 nondemented persons aged 75 to 85, who were followed over a five-year 
period as part of a volunteer cohort. All subjects received a standardized 
examination at entry into the study and were evaluated annually with a 
cognitive test battery. If the test battery indicated abnormal decline, subjects 
were referred for a detailed dementia work-up. The cohort was 90% white, 
64.5% female, and 70% Jewish. During the study, new cases of dementia 
were diagnosed, which led to an incidence rate of 3.5 per 100 person-years, a 
result similar to the Lundby study (39). The incidence rate for AD 
was 2.0 per 100 person-years. Interestingly, higher scores (fewer errors) on 
the initial mental status test were associated with lower risk of subsequent 
(new) AD. 

Studies utilizing the Mayo Clinic database and the population of Rochester, 
Olmsted County, Minnesota, involved cases with onset in 1960-1964 (42) 
and for three consecutive five-year periods ending in 1974 (43). Based on 178 
dementia cases, the age-specific rates for AD increased from 0.1 per 100 
person-years in the 60-69 age group to 0.5 in the 70-79 age group and to 1.4 
per 100 person-years in the over-80 age group; these results are close to those 
for the cohort aged 75-85 in the Bronx. The second study (43) confirmed 
the pattern of age-specific rates and concluded that the incidence of AD did 
not change during the 15-year time period. The 20-year follow-up through 
medical records allowed time for the diagnosis of AD to be substantiated, a 
feature unavailable to conventional cohort studies. Schoenberg (44) cautions 
that even the Rochester rates may be underestimates, because some persons 
may still not have reached medical care or may have gone unrecognized by 
providers. [However, these studies (41-43) serve as a benchmark for the 
incidence of AD in the US.] 


EPIDEMIOLOGY OF ALZHEIMER’S DISEASE 


Analytic epidemiologic studies have tested a wide variety of conditions and 
exposures for association with the onset of AD (see 44-47). In addition, a 
consortium of original investigators has analyzed pooled data from the major 
epidemiologic case-control studies to arrive at potentially more stable estimates 
of risk, especially for those studies in which the numbers of subjects 
were small or the power low (48). Here, we summarize findings and describe 
how possible sources of bias might affect the strength of association with 
putative risk factors. 


Family History 


Estimates of increased risk to family members of AD patients are the most 
consistent of any factor investigated, except age (46). The estimates range 
from 1.0 (49) to 7.7 (50), with a crude mean of about 4.0 among seven case-control 
studies (49-55). 

One possibility for bias lies in the ascertainment of cases and controls and 
how the family members were diagnosed as having AD. Because nondifferential 
misclassification of exposure (in this case family history) should reduce the 
risk estimate toward the null, any elevated (or decreased) observed risk is 
likely to be an underestimate of the true risk. If the misclassification is 
differential, rather than nondifferential, then the resulting risk estimate could 
be biased in either direction (56). The degree to which cases are self-selected 
to attend a referral-level specialty clinic for the diagnosis of AD arguably may 
be related to the family’s previous experience with the disease. If the disease 
has occurred in other relatives, the patient or patient’s caregiver may be more 
likely to recognize the onset of the disease and seek help. Because referral or 
specialty centers are the most commonly used source of cases for research 
studies, the occurrence of AD in relatives of cases that enter studies could be 
spuriously high. Another possible source of bias, which has not been studied, 
is a recall bias: Case informants may be more likely to recall dementia in other 
relatives than control informants. This is plausible, as informants’ close 
association with the case may increase their awareness of dementia symptoms 
in more distant relatives, whereas control informants may be more likely to 
consider mild to moderate symptoms of dementia in other relatives just 
normal aging. 
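
The attenuation produced by nondifferential misclassification can be seen with a small expected-value calculation. In the Python sketch below, the true exposure proportions and the sensitivity and specificity of the family-history report are hypothetical values chosen for illustration; the same error rates are applied to cases and controls, which is what makes the misclassification nondifferential.

```python
def odds_ratio(p_exposed_cases, p_exposed_controls):
    """Odds ratio from exposure proportions in cases and controls."""
    odds_cases = p_exposed_cases / (1.0 - p_exposed_cases)
    odds_controls = p_exposed_controls / (1.0 - p_exposed_controls)
    return odds_cases / odds_controls


def observed_proportion(true_proportion, sensitivity, specificity):
    """Expected proportion classified as exposed under imperfect measurement."""
    return (sensitivity * true_proportion
            + (1.0 - specificity) * (1.0 - true_proportion))


if __name__ == "__main__":
    # Hypothetical true proportions with a positive family history.
    true_cases, true_controls = 0.40, 0.15
    # Same (hypothetical) error rates for cases and controls: nondifferential.
    sens, spec = 0.70, 0.95
    obs_cases = observed_proportion(true_cases, sens, spec)
    obs_controls = observed_proportion(true_controls, sens, spec)
    print(f"true OR     = {odds_ratio(true_cases, true_controls):.2f}")
    print(f"observed OR = {odds_ratio(obs_cases, obs_controls):.2f}")
```

With these inputs the observed odds ratio falls between 1.0 and the true value. If the error rates differed between cases and controls (differential misclassification), the observed value could move in either direction.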

The definition of “family history” varies across studies; most often it is, 
appropriately, the occurrence of AD in at least one other first-degree 
relative (parent, sibling, or child). Some studies loosely include any other 
family member who either has or had AD. Obtaining an accurate diagnosis of 
the affected relative based on the report of an informant relative is a signifi- 
cant problem (57); autopsy findings or medical records should be sought, but 
are often unavailable. Further problems arise from failure to consider the size 
of the family, the age of living family members, the age at death and cause of 
death of deceased family members, the age at onset of AD, or an estimate of 
the age-specific incidence of the disease (58, 59). Hughes (1991, in prepara- 
tion) has developed a “familial risk” score for each family, which utilizes the 
items mentioned above. The classification of cases as familial or sporadic, 
therefore, becomes a stochastic process, by which some individuals have high 
probability of being true familial cases and others have a high probability of 
being true sporadic (nonfamilial) cases. The remainder, which is likely to be a 
substantial proportion, would have some intermediate probability attached to 
them and would necessarily remain unclassified. 

Genetic linkage studies seek large pedigrees with many affected members 
to search for genetic markers. A family-history-positive pedigree that in- 
cluded a proband and one other relative would be of little value for genetic 
linkage studies. Large pedigrees with multiple affected members are quite rare, and 
genetic heterogeneity appears to exist (63, 64). The relative importance of 
genetics in the etiology of AD is still a key question, but it is unlikely that 
crude measurements, such as “family history,” identify homogeneous disease 
subgroups that could provide new information about genetic or environmental 
risk factors. 


Head Trauma 


Prior head trauma as a potential risk factor for AD has been evaluated in many 
of the current case-control studies (42-52, 55, 65-69). The rationale for the 
possible association of head injury and AD is based on the occurrence of 
dementia pugilistica (DP), the punch-drunk syndrome seen in boxers, in 
which continuous insult ultimately leads to neuron loss. Until recently, the 
pathological findings of DP and AD were thought to be distinct; the former 
exhibited an excess of neurofibrillary tangles and no amyloid plaques, the 
latter exhibited both plaques and tangles. Roberts et al (70) immunocytochemically 
demonstrated diffuse deposits of β-amyloid that were not evident 
previously with conventional staining methods. Thus, at least based on this 
molecular marker, DP and AD may share common pathogenetic mechanisms 
that lead to tangle and plaque formation. 

Most studies have found that head trauma (with recovery) before the onset 
of AD occurs more frequently in AD cases than in controls. Point estimates of 
the odds ratio, which average about 2.9, vary from 0.6 (66) to 6.0 (49); 
however, the 95% confidence intervals for the odds ratios exclude 1.0 in only 
three of nine case-control studies (50, 65, 67). History of head injury is 
usually obtained from surrogate respondents, who may be more likely to 
recall previous head injury as they seek explanations for the onset of AD. 

Chandra et al (68) utilized the medical record system of the Mayo Clinic to 
identify all persons with head injuries and loss of consciousness who came to 
medical attention in a cohort of incident AD cases and matched control 
subjects. Of the 274 pairs, nine were discordant for head injury with loss of 
consciousness: five in which the case was injured, and four in which the 
control was injured. Because of the relative rarity of head injuries among the 
controls, the power to detect a true difference was diminished, even though 
the overall sample size was large. The authors concluded that, because their 
study found no difference and had no recall bias, head trauma with loss of 
consciousness was unlikely to be related to the onset of AD. They did not 
assess whether minor head injuries differed between cases and controls; this 
aspect may have been the basis of the findings of previous studies. Katzman et 
al (41) followed a cohort of 434 persons aged 75-85 for five years to 
determine the incidence of dementia and AD. No difference in the occurrence 
of prior head trauma was found for those who became cases. Recall bias was 
also minimized in this study, as head trauma data were collected before the 
clinical onset of disease. 
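
For a matched case-control design such as the Mayo Clinic study above, only the discordant pairs carry information about the association. The Python sketch below computes the matched-pair odds ratio and an exact two-sided McNemar (binomial) p-value from the reported counts of five and four discordant pairs. It is an illustrative calculation with hypothetical function names, not a reconstruction of the authors' analysis.

```python
from math import comb


def matched_pair_odds_ratio(case_only, control_only):
    """Conditional odds ratio: ratio of the two discordant-pair counts."""
    if control_only == 0:
        raise ValueError("odds ratio undefined when control_only is zero")
    return case_only / control_only


def exact_mcnemar_p(case_only, control_only):
    """Two-sided exact binomial p-value for discordant pairs under H0: p = 0.5."""
    n = case_only + control_only
    k = max(case_only, control_only)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)


if __name__ == "__main__":
    # Discordant pairs reported in the text: case injured 5, control injured 4.
    print(matched_pair_odds_ratio(5, 4))
    print(exact_mcnemar_p(5, 4))
```

With only nine discordant pairs among 274, the data cannot distinguish a modest association from none, which is the loss of power noted above.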

The following are key questions that surround the evaluation of head 
trauma as a risk factor for AD: 


1. What dose, in terms of frequency or severity, increases the risk of AD? 

2. What proportion of AD is associated with head trauma? Is the AD associated 
with head trauma a specific subtype? 

3. What is the optimum time at-risk between head injury and onset of AD? 

4. Will seeking to prevent head injury have a measurable impact on public 
health by reducing the incidence of AD? 

5. Is there a valid biologic marker for exposure to head injury? 


If head trauma is a risk factor for AD, answers to these questions could lead to 
intervention strategies aimed at the prevention of head injury and, thus, AD. 


Aluminum Exposure 


Aluminum compounds have been found in the neurofibrillary tangles and the 
cores of plaques that occur in the brains of AD patients (71). Jacobs et al (72), 
however, found no difference in the levels of aluminum in AD and control 
brains. Aluminum is neurotoxic and has been implicated as a possible cause 
of encephalopathies related to dialysis (71, 73, 74), which have different 
clinical and pathologic characteristics from AD. However, aluminum is also very 
common, and daily intake, primarily from foodstuffs, may exceed 20 mg. 
Attempts to monitor tissue concentrations of aluminum in serum, blood, hair, 
and urine have shown that these concentrations were not associated with 
dietary intake or age, nor were they stable over time (75). 

Martyn et al (76) recently reported an increased risk of AD for persons who 
resided in water districts in which the aluminum content was 0.11 mg/l or 
greater, compared with 0.01 mg/l, with an estimated relative risk of 
1.5 (95% confidence interval 1.1-2.2). Alzheimer’s cases included in the 
study were only those aged less than 70 years who had attended a cranial 
computed tomography clinic. No dose-response trend was evident. 

A recent case-control study (77) of AD reported an association between 
aluminum-containing antiperspirants and onset of AD with an odds ratio of 
1.6 (1.0, 2.4). A dose-response trend was observed for frequency of use. 
Paradoxically, the odds ratio for use of aluminum-containing antacids was 0.7 
(0.3, 2.0), which possibly reflects differences in bioavailability of aluminum 
administered in different forms. 
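
Confidence intervals for ratio measures such as those quoted above are usually constructed on the logarithmic scale, so the point estimate should sit near the geometric mean of the limits. The Python sketch below, offered as an illustrative check rather than the method used in the cited studies, backs out the implied standard error of the log ratio and the geometric mean of the limits for the reported intervals. It assumes Wald-type 95% intervals, which the text does not confirm for the two odds ratios.

```python
from math import log, sqrt

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def implied_log_se(lower, upper):
    """Standard error of log(ratio) implied by a Wald-type 95% interval."""
    return (log(upper) - log(lower)) / (2.0 * Z_95)


def geometric_midpoint(lower, upper):
    """Geometric mean of the limits; near the estimate if the interval is log-symmetric."""
    return sqrt(lower * upper)


if __name__ == "__main__":
    # (label, point estimate, lower limit, upper limit) as quoted in the text.
    reported = [
        ("water aluminum, relative risk", 1.5, 1.1, 2.2),
        ("antiperspirants, odds ratio", 1.6, 1.0, 2.4),
        ("antacids, odds ratio", 0.7, 0.3, 2.0),
    ]
    for label, estimate, lower, upper in reported:
        print(f"{label}: estimate {estimate}, "
              f"geometric midpoint {geometric_midpoint(lower, upper):.2f}, "
              f"log SE {implied_log_se(lower, upper):.2f}")
```

A larger implied standard error, as for the antacid estimate, indicates a less precise estimate, so the apparent contrast between antiperspirants and antacids should be read with that imprecision in mind.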

Among 1353 Canadian gold and uranium miners (78) who inhaled alumi- 
num powder (McIntyre powder) before entering the mine, as prophylaxis 
against silicotic lung disease, neuropsychological testing classified signifi- 
cantly more exposed miners than nonexposed as having cognitive impair- 
ment, with relative risk of 2.6. The proportion of impaired miners rose with 
the years of exposure (78), but there was no greater occurrence of neurolog- 
ical problems or AD. 

To carry the study of aluminum in the etiology of AD beyond the stage of 
controversy and conjecture, the biochemistry of aluminum and the identifica- 
tion of the particular forms specific to pathogenesis of AD must be described 
in detail. If AD and aluminum are linked, prevention will require methods for 
determining exposure and its temporal relation to the onset of disease. 


Viruses 


Infectious particles, e.g. prions, can cause spongiform encephalopathies, 
such as Creutzfeldt-Jakob disease, scrapie, kuru, and Gerstmann-Sträussler 
syndrome (79-81). Manuelidis et al (82) attempted to transmit AD to ham- 
sters injected with buffy coat from AD patients. Their results were in- 
conclusive; the affected hamsters developed spongiform changes and not AD 
pathology, which raises the possibility of contamination with spongiform 
agents. 

Postmortem AD brain tissue has been tested for herpes viral DNA, with 
negative results (83, 84). Comparison of patients with AD, Down syndrome, 
and other causes of dementia with age-matched normal control subjects for 
cross-reactive antibodies to HIV type 1, caprine arthritis encephalitis virus, 
and equine infectious anemia virus revealed no cross-reactive antibodies in 
cerebrospinal fluid or serum (85). Renvoize et al (86) found no difference in 
titer for Adenovirus, Chlamydia Group B, Coxiella burnetii, Cytomegalovirus, 
Herpes simplex virus, Influenza A, Influenza B, Measles, and Mycoplasma 
pneumoniae between 33 clinical AD patients and 28 psychiatric patients. 

If AD is an infectious process, it is more likely to result from an un- 
conventional virus or infectious protein than from more common viruses (80, 
81, 86). However, detection techniques used to date (86) may not be suf- 
ficiently sensitive, and/or the tissues tested may not reflect the genesis of the 
disease. 


Other Factors 


Equivocal or unsubstantiated increased risk estimates have been found for 
maternal age, Down syndrome in relatives, thyroid disease, smoking, and 
sedentary lifestyle (44-47). Kokmen et al (87) have recently reported no 
increased risk for therapeutic radiation. Many other possible factors have been 
evaluated in the reported case-control studies without significant result. 

Because molecular genetics has been proceeding rapidly in the study of AD 
(61-64), and because genetic heterogeneity is likely, the possibility of in- 
teractions between genotype and external risk factors must be considered in 
future epidemiologic studies. Ottman (88) recently outlined five generic ways 
in which the effects of risk factors and genotypes could combine. 


BIOLOGY OF ALZHEIMER’S DISEASE 


The biology of AD has been dominated for almost a century by the remark- 
able morphological changes, which were described in 1907 by Alois 
Alzheimer, in the brain of a patient who died at age 55 with a progressive 
dementia that involved memory, language, and behavioral changes (89): 
atrophy of the brain, especially involving the neocortex; the presence of 
abnormally staining neurons, the neurofibrillary tangles; and the presence of 
numerous neuritic plaques, focal collections of degenerating nerve terminals 
that surround a core of an abnormal fibrillar protein, β-amyloid (90, 91). The 
accelerating pace of AD research during the past several years has led to a 
series of remarkable findings, including quantitative analysis of the cellular 
changes, clinical-pathological correlations, and initial identification of the 
molecular aspects of these abnormal structures. These advances have made it 
possible to begin forming a clearer picture of AD as a chronic disorder in 
terms of the changes in the brain that characterize it. 

One fundamental question is now being answered: “What causes dementia 
in Alzheimer’s disease?” Cognitive impairment, measured by mental status 
testing, is caused by the loss of synapses (and neurons), particularly in frontal 
and parietal association neocortex. Important losses of nerve cells and 
synapses occur in other regions of the brain, such as the hippocampus, which 
is especially involved in new learning, and the amygdala, hypothalamus, and 
olfactory cortex, which are involved in emotional and behavioral changes. 
There is also major involvement of the basal forebrain cholinergic nuclei, the 
locus coeruleus noradrenergic nuclei, and dorsal raphe serotonergic nuclei. 
These are all systems that project primarily to neocortex and hippocampus, 
thus contributing their terminals as the presynaptic component to neocortical 
and hippocampal synapses. 

Loss of synapses has been shown electron microscopically (92) and histo- 
logically (93, 94). Masliah et al (94) used antisynaptophysin, an antibody to 
one of the proteins that coat synaptic vesicles, to determine the density of 
synapses in association neocortex. The density of synapses measured with this 
antibody correlated extraordinarily well (r > 0.7) with measures of cognitive 
function, including the Mini-Mental State Exam, the Information Memory 
Concentration test, and the Mattis Dementia Rating Scale, which were 
carried out during the year before death (95). These highly significant 
correlations were obtained on a cohort of AD patients that did not include cognitively 
normal subjects. By using a stepwise linear regression, approximately 90% of 
the variance in cognition during the last year of life could be accounted for by 
a model that primarily used the midfrontal and inferior parietal synaptic 
densities. 
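
As a rough guide to reading these figures, the share of variance explained by a single linear predictor is the square of its correlation coefficient. The sketch below is illustrative arithmetic only, not the analysis performed in the cited studies; the function name is hypothetical, and 0.7 is simply the lower bound on r quoted above.

```python
def variance_explained(r: float) -> float:
    """Share of variance explained by one linear predictor with Pearson correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# Lower bound quoted in the text for synaptic density vs. cognitive scores.
single_measure_share = variance_explained(0.7)
print(f"r = 0.7 explains about {single_measure_share:.0%} of the variance")
```

A single measure correlating at r = 0.7 therefore accounts for roughly half of the variance, which suggests that the approximately 90% figure reflects the combined midfrontal and inferior parietal measures in the stepwise model rather than any one correlation.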

If the dementia is due to loss of synapses in neocortex, the next question is, 
“What causes this loss?” What is the sequence of events that leads to the 
abnormal changes in nerve cell bodies (neurofibrillary tangles), the development 
of neuritic plaques with their β-amyloid core, and the loss of synapses? 
Although the answer is unknown, the earliest stages in the pathogenetic 
sequence, as well as some aspects of later stages, are beginning to be 
understood. 

One major hypothesis of pathogenesis concerns the role of the β-amyloid. 
This protein fragment has been sequenced (96), and the gene for the amyloid 
precursor protein (APP) has been mapped to chromosome 21 (97). The gene 
itself has now been sequenced (98); it codes for an interesting series of 
isoforms of APP, which differ by the presence or absence of a serine protease 
inhibitor insert. Amyloid precursor protein is a molecule produced normally 
by cells, including neurons, and is probably required for cell viability. The 
abnormal breakdown product, β-amyloid, accumulates in the core of the 
neuritic plaque, and a fragment of this polypeptide may itself be neurotoxic 
(99). The regulatory region for the APP gene contains primarily GC signals; 
thus, APP probably is a normally produced protein with a “housekeeping 
function.” The region also contains two regulatory sites that could respond to 
injury, ischemia, and other insults—that is, a heat shock regulatory region 
and a c-fos regulatory region (100). 

One hypothesis that has strong circumstantial support is that the first step in 
the development of AD, in at least some persons, is the laying down of small 
focal collections of the β-amyloid protein and its precursor protein (APP) in 
“diffuse plaques.” Diffuse plaques precede neuritic plaques and neurofibrillary 
tangles by some years in at least the form of AD that develops in 
individuals with Down syndrome (101, 102). In Down syndrome, there is a 
triplication of chromosome 21, which results in three gene doses, not the two 
doses normally present on two chromosomes. The gene for APP is on 
chromosome 21 and is close to the beginning of that section of the chromo- 
some involved in the production of Down syndrome. One could, therefore, 
assume that in Down syndrome, unusually large amounts of APP are formed, 
some of which are then broken down to the β-amyloid. Diffuse plaques are 
found as early as age 10 in brains studied at autopsy, whereas the full-fledged 
Alzheimer changes typically do not occur until age 30-40. Thus, a 10-25-year 
lag period may exist between the first production of the diffuse plaques 
and the development of further changes. 

A second condition in which an abnormality of the APP gene produces 
amyloid disease occurs in a particular genetic form of familial AD. These 
particular kindreds have a mutation on the APP gene at a DNA nucleotide 
adjacent to the beginning of the sequence (near the carboxy terminal) for the 
β-amyloid peptide itself (63). This mutation probably makes the breakdown 
or degradation of APP into this peptide more likely to occur, thus leading to 
development of an early onset form of AD with symptoms often manifested 
during the fifth decade of life. Schellenberg et al (64) have provided evidence 
that this particular mutation is rare. 

If head injury is a risk factor for AD (45-51), it could act, in part, by 
leading to the release of β-amyloid, with the production of diffuse plaques 
and eventual later onset of Alzheimer’s. In this regard, the production of 
diffuse plaques might be analogous to various initiation events that precede 
malignancies. In the development of cancer, initial events often require some 
further carcinogenic event to set off the malignant process. For example, the 
first event in the pathogenesis of cervical cancer is infection by particular 
types of papilloma virus. But a second carcinogenic event is required before 
the cells become malignant. Is this also true of AD? Does the production of 
the diffuse amyloid plaques simply act as an initiating factor, which then 
requires some second event to set off the full disorder? 

During the symptomatic phase of AD, neuritic plaques and neurofibrillary 
tangles are formed, and nerve cells, particularly large neurons, and synapses 
are lost. The process of cell death in AD is not understood, but each 
neurofibrillary tangle is now known to be a degenerating neuronal body that 
contains thousands of abnormal fibrils, which are most commonly present in 
pairs of long, very thin fibers wound around each other in a helical fashion, 
the so-called paired helical filaments (103). These fibrils consist, at least in 
part, of accumulations of an abnormally phosphorylated form of a group of 
proteins, the Tau proteins. Tau proteins are normally expressed in nerve cell 
bodies during early development and are present in abundance only on axons 
during normal adult life, but they accumulate in the cell body of neurons 
affected by the Alzheimer process. The “Alz-50” antibody, which has been 
used to identify Alzheimer brains immunochemically, reacts with a specific 
phosphorylated Tau epitope. Thus, the intracellular fibrous proteins in the 
nerve cells in the Alzheimer brain become abnormally phosphorylated. In 
addition to Tau and ubiquitin, otherwise normal neurofilaments present in 
these neurons become abnormally phosphorylated. There are alterations in 
several of the kinases, enzymes that control phosphorylation. Specifically, 
a-2 protein kinase C is markedly reduced (104). Casein kinase is elevated in 
cells associated with neurofibrillary tangles, but is generally reduced in the 
AD brain (105). 

If these changes are essential to the development of the Alzheimer process, 
they suggest possibilities by which the disease might be modified or prevented 
pharmacologically. The laying down or abnormal degradation of APP into 
β-amyloid and the intracellular events, including the abnormal phosphorylation, 
that occur in later stages are of interest to several laboratories, including 
biotechnology and drug firms. Neurotrophic factors may be helpful. The 
cholinergic system in rat brains is controlled by a particular neurotrophic 
nerve growth factor (NGF). Administration of NGF to rats not only protects 
the cholinergic projection system from the basal nucleus of the forebrain to 
the cortex following experimental injury, but also improves the maze learning 
performance of impaired elderly rats. Whether a similar effect would occur in 
primates and humans is of intense interest. 

The basal forebrain cholinergic projection system to neocortical association 
areas and hippocampus has received special attention, since its markers were 
found to be decreased by 70-90% (90, 91). The loss of choline acetyltrans- 
ferase, a marker of cholinergic neurons, correlates well with degree of 
dementia, although not to the degree that loss of synapses does. The early 
finding of this cholinergic deficit led to numerous attempts to treat the disease 
by using cholinomimetic drugs that act directly upon the system or drugs that 
inhibit the breakdown of the remaining acetylcholine that is formed in the 
Alzheimer brain. One of the latter drugs, tetrahydroaminoacridine, whose 
chemical name is now tacrine, has received widespread publicity from a 
recent large national trial; tacrine does seem to produce mild cognitive 
improvement that lasts several months in a subset of Alzheimer patients 
(106). Although not yet well studied in humans, we suspect that an agent like 
NGF, which also acts primarily on this cholinergic system, may ultimately be 
more useful as therapy than cholinomimetic agents; at present, however, there 
is no convincing evidence that existing treatments modify the course of AD. 

The importance of dementia, a syndrome of global cognitive impairment, has 
gained widespread recognition in the past two decades. The most common 
cause, Alzheimer’s disease, may be the single greatest source of dysfunction 
among persons over age 85. The disease, distinct from normal aging, is 
progressive, has a highly variable course, and can have tremendous impact on 
families. Duration of symptoms averages eight to ten years from onset, four to 
five years from diagnosis. Diagnosis is often delayed, because of the insidious 
nature of the illness. 

Prevalence and incidence rates may vary severalfold between different 
studies. One consistent finding is the dramatic increase with age; prevalence 
rates are 25-48% for persons over age 85. The two most consistent risk 
factors for AD are age and positive family history. Another likely risk factor 
is head trauma. Alzheimer’s disease is almost certainly due to heterogeneous 
causes. 

Biologic understanding of AD is primarily based on study of distinc- 
tive pathologic changes in the brain. Increasingly well-characterized patho- 
logic changes precede the clinical manifestation of disease. Ultimately, 
the pathologic cause of AD is loss of neurons and neuronal connections 
in the brain, especially in the frontal and parietal association neocortex. 
Many systems are involved, but the most affected system is the cholinergic 
system, followed by the noradrenergic and serotonergic neurotransmitter 
systems. 

Drug treatment and other intervention strategies to prevent or delay pro- 
gression of the disease have been limited, primarily because so little is known 
about the cause or risk factors for the disease. Current palliative treatments are 
attempts to minimize the morbidity of abnormal behaviors and medical 
complications associated with dementia. Ideally, treatment would involve 
either replacement therapy or drugs that prevent or delay the pathologic 
changes that occur in AD. Two of the more promising experimental therapeutic 
attempts involve interruption of the pathogenetic events involving β-amyloid 
and its precursor proteins and administration of NGF. Public health 
attempts at risk factor reduction are clearly premature. 

The next decade or two will likely witness improved understanding of the 
pathogenesis of Alzheimer’s disease. Epidemiologic study will contribute to 
our knowledge of pathogenesis and may reveal heretofore unrecognized and, 
hopefully, modifiable risk factors. Social and clinical strategies to deal with a 
disease of such immense importance also need development and evaluation. 
The public’s interest in this disease is intense, and it looks to the scientific 
community for progress in understanding and managing this often tragic 
illness. 


All parts of the body which have a function, if used in moderation and exercised in labors to 
which each is accustomed, become thereby well-developed and age slowly; but if unused 
and left idle, they become liable to disease, defective in growth and age quickly. 


Hippocrates 


INTRODUCTION 


The prevention of disability and the preservation of independence have 
become the clinical and policy priorities for the health care of older adults 
(84). Aging is characterized by a diffuse loss of physiologic capacity and 
reserve and reduced ability to adapt to challenges. Almost without exception, 
both cross-sectional and cohort studies demonstrate declines in average 
physiologic performance or capacity with advancing age. There had been a 
tendency to assume that such functional declines were genetically programmed 
and, as a result, inevitable. However, in reviewing the various 
declines in function with age found by the Baltimore Longitudinal Study of 
Aging, Shock (77) noted that “the effects of age are highly individual, and 
chronological age alone is a poor index of physiologic function.” 

¹The US Government has the right to retain a nonexclusive royalty-free license in and to any 
copyright covering this paper. 

Investigators have noticed that many older subjects evidenced minimal or 
no decline. Rowe & Kahn (72) suggested that this preservation of function 
represents “successful aging” and that the steep mean declines in function 
assumed to be “normal” or “usual” may not be inevitable consequences of 
growing older. The challenge inherent in this revised view of aging is to 
identify the extrinsic determinants of decline (36, 72) and intervene upon 
those that are mutable. 

In this paper, we examine the observational evidence concerning the effects 
on physical health of one potentially deleterious and mutable extrinsic 
factor—inactivity. Our primary focus is on the evidence linking inactivity to 
disability and frailty or reduced physiologic reserve. We also discuss the 
closely related literature concerning the association in older adults between 
inactivity and depression, fractures, coronary heart disease, and mortality. 

A companion paper (15) considers the experimental evidence that increas- 
ing activity reduces or eliminates functional decline. The experimental data to 
date are limited by short follow-up intervals, highly selected participants, and 
interventions largely aimed at increasing vigorous activities. The observation- 
al evidence reviewed in this paper, therefore, furnishes complementary in- 
formation about the health effects of the full distribution of physical activities 
in broader segments of the population over longer time periods. 

The methodologic limitations of observational studies of activity and health 
have been well described (52). Observational studies are limited by the 
well-known difficulties of validly measuring physical activity (53) and the 
selective forces that lead individuals to adopt various patterns of activity. In 
this review, we pay close attention to the method of measuring activity in key 
studies. Because healthier individuals are more likely to remain active, we 
give greater emphasis to longitudinal studies that can control for differences in 
baseline health status. 


Inactivity and Disability 


Figure 1 depicts the various interrelated pathways by which inactivity can 
produce or accelerate the onset of disability and death. The model is derived 
from several sources (17, 21, 34, 86). Although inactivity increases the risk 
of disabling health events, such as myocardial infarction, its major impacts on 
functional status are probably more protracted and subtle. We propose that 
inactivity accelerates the rates of decline of major physiologic adaptive 
systems (deconditioning), which eventually reach the point at which the 
individual’s ability to prevent or recover from acute stresses is impaired. The 
aging individual’s ability to cope with such stresses and preserve subsequent 
function depends upon the maintenance of adequate physiologic reserves, 
particularly neurologic control, mechanical performance, and energy 
metabolism (4, 17), and is also assisted by such modifying factors as positive 
affect. 


The Disuse Hypothesis 


Bassey (5) and Bortz (11) noted the similarities between the structural and 
functional declines associated with aging and the effects of enforced inactiv- 
ity, such as bed rest or space flight. They also noted the ability of activating 
interventions, like exercise programs, to slow the rate of decline. Bortz 
defined a “disuse syndrome” (10), characterized by loss of cardiovascular 
reserve, obesity, “musculoskeletal fragility,” and depression, and argued 
forcefully for the potential of exercise to prevent or reverse the syndrome. 

Acute inactivity and deconditioning account for some of the steep declines 
in function seen in hospitalized older patients (37, 38). Whether slowly 
progressive deconditioning due to habitual inactivity contributes to reductions 
in function and reserve is one focus of this review. We hypothesize that the 
deconditioning effects of inactivity account for a substantial proportion of the 
negative slope in functional capacity with increasing age and that programs to 
increase activity can reverse some of these effects. Although most of the 
evidence available pertains to physical activity, inactivity in other aspects of 
life (intellectual, social, interpersonal) might also have adverse health 
consequences. 

Figure 1 The relationship of inactivity to disease, disability, and death: a conceptual model. 


What Do We Mean by Physical Activity? 


Physical activity is purposeful body movement, whereas exercise refers to 
deliberate efforts to increase activity beyond that needed to perform social 
roles. Investigators have tended to consider activities in terms of their usual 
energy expenditure (calories/unit time) and impact on cardiovascular fitness 
as measured by aerobic capacity. This limited view may obscure other 
dimensions of physical activity that are important to the health of older adults. 
Besides the duration and intensity of energy expenditure, activities also vary 
in such dimensions as the parts of the body affected, the nature and strength of 
the forces acting on those body parts, the degree of central nervous system 
processing required for the activity, and the quantity of endorphins released. 
These additional dimensions may be critical to producing specific health 
effects. For example, activities with similar energy expenditures, but different 
degrees of stress on bones (e.g. swimming and jogging), may have very 
different effects on bone density (79). Characterizing activities based on their 
impact on other physiologic capacities, such as bone density, muscle strength, 
or balance, is a fruitful research direction. In the meantime, estimation of time 
spent doing various specific activities would seem to preserve the most 
information (26). 
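
The point that a single energy total can hide health-relevant dimensions can be sketched in a few lines. This is a hypothetical illustration, not an instrument from the cited studies: the record fields, the function names, and every numeric value (including the energy rate) are placeholders chosen only to make two activity logs with equal energy expenditure differ in weight-bearing time.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ActivityRecord:
    name: str
    minutes_per_week: float
    kcal_per_minute: float  # placeholder rate, not a measured value
    weight_bearing: bool    # one extra dimension beyond energy cost


def weekly_energy(records: Iterable[ActivityRecord]) -> float:
    """Total weekly energy expenditure (kcal) across all recorded activities."""
    return sum(r.minutes_per_week * r.kcal_per_minute for r in records)


def weight_bearing_minutes(records: Iterable[ActivityRecord]) -> float:
    """Weekly minutes spent in activities that load the skeleton."""
    return sum(r.minutes_per_week for r in records if r.weight_bearing)


# Two hypothetical logs with identical energy totals.
swimmer = [ActivityRecord("swimming", 120, 7.0, weight_bearing=False)]
jogger = [ActivityRecord("jogging", 120, 7.0, weight_bearing=True)]

print(weekly_energy(swimmer), weekly_energy(jogger))
print(weight_bearing_minutes(swimmer), weight_bearing_minutes(jogger))
```

Keeping time per specific activity, rather than only the aggregate energy figure, lets later analyses regroup the same records by whichever dimension (bone loading, strength, balance) turns out to matter.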


Activity Patterns in Older Adults 


The proportion of older persons who are inactive varies with the definition of 
physical activity or exercise used and the population studied. In a well-to-do 
retirement community, only 10% reported no exercise at all (56). However, 
43% of a national sample of individuals 65 years and older were categorized 
as sedentary (20), and 61% of respondents to the Behavioral Risk Factor 
Surveys who were 55 years and older (89) and 53% of a random sample of 
older (age 65+) residents of Nottinghamshire, England, reported no leisure 
physical activity (26). Evidence from repeated Canadian surveys suggests that 
physical activity among seniors has increased over time (19). 

Self-reported physical activity declines with increasing age. Most of the 
decline in physical activity with age involves more strenuous activities (56, 
58). Only about 10% of older adults report such activities as running or 
swimming, which are vigorous enough and performed with sufficient regular- 
ity to meet the Public Health Service’s national objectives (20, 56, 84). 
Substantial proportions of all surveyed senior populations report involvement 
in less vigorous activities, such as gardening and walking, as exercise. In fact, 
these two activities account for much of their total leisure energy expenditure 
when activities are aggregated (15, 26, 56, 58). The vigor of gardening and 
walking is likely to be highly variable, which adds to the difficulties of validly 
assessing activity levels in older individuals. Gardening and walking are 
reported as exercise far less frequently by younger adults. One potential 
explanation for this age difference in what might be considered as “exercise” 
is the observation that older adults rate the exertion needed to achieve a given 
heart rate elevation more highly than younger counterparts (78). 

What accounts for the reduction in activity, particularly vigorous activities, 
with increasing age? Mobily (57) identified two sets of barriers to exercise in 
the elderly: environmental or normative messages to slow down and personal 
frustrations associated with illness or the perceived difficulty of engaging in 
strenuous exercise. A common perception is that social norms urge older 
people to relax, yet most surveys indicate that older people associate well- 
being and life satisfaction with higher levels of activity (12). However, one of 
the few empiric studies of attitudes about exercise showed that older respon- 
dents viewed participation in more vigorous activities as less appropriate for 
older persons (61). Attitudes about participation in physical activity may be 
changing, particularly among more affluent and educated seniors (56), but 
data are sparse. 


The Effects of Illness on Activity 


The higher rates of ill health among older adults account for some of their 
decline in activity. Ill health appears to be a major reason for drop-outs from 
exercise programs (47). Discomfort may be an important reason; one survey 
found that declines in physical activity among seniors were strongly correlated 
with increased symptoms (30). 

A second reason may be that older persons are more cautious than their 
younger counterparts (12) and more likely to reduce activities after acute 
health events for fear of a recurrence or other injury. Fear of falling has now 
been recognized as a threat to health that is, perhaps, more consequential than 
falling itself (55). Fallers and near-fallers experience declines in health and 
increases in health care utilization out of proportion to the extent of the 
injuries incurred (46, 85). We suspect similar scenarios with other health 
problems, although this important issue needs further research. 

Well-meaning efforts by medical providers, family, and friends to protect 
older persons from possible harm related to activity probably contribute to 
activity limitation (12, 51, 88). This tendency to protect the elderly may 
become even more intense in the face of medically diagnosed illnesses, like 
coronary heart disease, diabetes, or arthritis, or visible reductions in the 
steadiness of gait or balance. As activities become more and more restricted, 
physiologic deficits due to deconditioning may become more visible, and the 
process escalates. The health care system may further contribute to the view 
of activity as dangerous through unsubstantiated “community standards,” 
such as those requiring that older adults protect their arthritic joints from the 
stress of exercise (31) or engage in treadmill testing before embarking on a 
moderate exercise regimen (75). 





456 WAGNER ET AL 


We conclude that the impacts of illness or injury on activity go beyond their 
pathophysiologic effects to include fear of harm from activity and, perhaps, 
well-intentioned discouragement of activity by medical providers and others. 


PHYSICAL ACTIVITY AND HEALTH 


The observational evidence linking activity and health in older adults comes 
from two principal study strategies: comparisons of athletes with nonathletes 
and comparisons within more representative populations that exhibit a range 
of activity levels. The former strategy predominates in studies that examine 
the relationship between activity and physiologic indicators, whereas the 
latter comprises most studies that link activity level to disease or functional 
status. A comparison of athletes with nonathletes is, of course, treacherous, 
because of the relationships between long-term favorable physiologic, life- 
style, and health characteristics and sustaining athletic activities. 


Physiologic Capacity or Reserve 


The maintenance of optimal functional status requires adequate physiologic 
capacity and reserve. Badley et al (4) suggested that three areas have the 
greatest relevance for subsequent function: neurologic control, mechanical 
performance, and energy metabolism. 


NEUROLOGICAL CONTROL Maintenance of function requires the perfor- 
mance of complex tasks, which in turn require well-functioning central and 
peripheral nervous systems. Tests of neurologic function are strongly associ- 
ated with current functional status (45, 90), future institutionalization (90), 
and risk of falls (60, 83). The relationship between impaired cognitive 
function, a crucial element in neurologic control, and disability is also clear. 
Does regular physical activity preserve neurologic control in the aging in- 
dividual? 

Several cross-sectional studies have compared reaction and movement 
times in response to various stimuli among individuals of differing ages and 
levels of regular exercise (22, 23, 69, 81). Mostly, these studies suggest that 
older exercisers perform better than older sedentary individuals and, in some 
studies, as well as much younger individuals. Similarly, some studies have 
found that physically active seniors have better cognitive function than in- 
active ones (1, 22, 23). The effects of exercise programs on cognition have 
been less positive, but the programs have been of short duration, and statisti- 
cal power has been limited by small samples (15). 

Spirduso (81) speculated that the effects of activity on psychomotor speed 
might be mediated by either enhancements of cerebral circulation or trophic 
influences on synaptic or neurocellular function. Rogers et al (70) recently 
supported the first hypothesis. They examined cerebral blood flow and cognitive 
function in three cohorts of older volunteers (mean age, 65 years): a 
group continuing to work, active retirees as measured by an activity in- 
ventory, and inactive retirees. The first two groups engaged in similar 
amounts of activity at baseline and maintained their baseline cerebral blood 
flow levels over the four-year follow-up period, whereas the inactive retirees 
experienced a significant decline in cerebral blood flow. Although cognitive 
function was measured only at the end of follow-up, it too was significantly 
lower in the inactive cohort. The second hypothesis, trophic effects, has 
received support from promising efforts to preserve cognition by increasing 
intellectual activity (74). 


MECHANICAL PERFORMANCE 


Muscle strength Reduced muscle strength, particularly in the lower ex- 
tremities, is associated with reductions in functional status and increased risk 
of falling (16). Muscle loss may also account for a substantial proportion of 
the age-related decline in aerobic capacity (33). The relationship between 
physical activity and muscle strength in older adults is controversial; some 
studies relate inactivity to age-related decline (2, 3, 24), and others fail to 
support such a relationship (42). This may not be surprising, as prevalent 
activities in older adults, like gardening and walking, may not include suf- 
ficient resistance exercise to improve strength. Bassey et al (6) did find that 
the calf strength of seniors correlated significantly with measures of walking 
speed, walking quantity (as measured by an accelerometer), and self-reported 
leisure activity. Although suggestive, the impact of less vigorous activities, 
such as walking and gardening, on lower extremity muscle strength remains 
uncertain. This would appear to be a critical question, given the increasing 
recognition of lower extremity weakness as a risk factor for falls (16). 


Bone density Both inactivity and aging are associated with losses in bone 
mineral density, a major risk factor for fractures of the hip, vertebrae, and 
wrist. Bone loss begins in early adulthood, accelerates around the menopause, 
and returns to a stable rate of decline for the remainder of the life span (68). 
Rather dramatic losses of bone mass follow enforced bed rest (48). This loss 
may not be recouped if an older, bed-confined patient does not return to full 
activities. The role of less extreme forms of inactivity on bone density has 
been the subject of considerable speculation and some research. 

Athletes, particularly runners, tend to have higher bone density than 
nonathletes, particularly in bones most affected by their sport. For example, 
Lane et al (50) compared bone density in 14 runners and matched controls 
over age 50. Bone density in the first lumbar vertebra was 40% higher in 
runners than in controls. However, there is very little observational evidence 
as to the effects of less vigorous activities, like walking, on bone density. 

Two epidemiologic studies have examined the relationship between activity 
and fractures in older adults. Cooper et al (25) compared recent activity levels 
and muscle strength in a community-based group of hip fracture patients with 
matched controls from the same community. They found that activity during 
the six weeks before the fracture, measured by an inventory of weight-bearing 
activities like standing and walking, was associated with reduced risk of hip 
fracture. Because fractures occur in sicker patients, recent activity may have 
been reduced by an illness that predisposed to fall and fracture. Sorock et al 
(80) assessed the relationship between activity and self-reported fracture over 
the next year in a retirement community by asking about involvement in nine 
different exercise activities. Among those participants reporting regular walk- 
ing, the risk of fracture was less than for those who were sedentary. However, 
walking may reduce fractures by preventing falls, rather than by strengthening 
bones. Further studies of the effects of less vigorous activities on bone density 
are sorely needed. 


ENERGY METABOLISM Studies of aerobic capacity, or the ability to perform 
work by using oxygen in response to maximal stress (VO2max), have gener- 
ated much speculation and research about age-related functional decline and 
the role of activity in that decline. Because aerobic capacity quantitates the 
ability to work, it should be closely related to functional status, and some 
evidence indicates that it is. For example, reduced VO2max limits the speed 
of walking or stair climbing (28, 41). 

Several studies of older athletes suggest that those remaining active in their 
sports experience a slower decline of VO2max than their more sedentary 
counterparts (76). However, former athletes who have given up their sport 
decline at the usual rate. Buchner & Wagner (18) reviewed six longitudinal 
studies (27, 28, 39, 40, 44, 65) that compared changes in VO2max over time 
in active and sedentary older adults. In every study, the rate of decline was 
lower in the active cohort, who declined at annual rates of 0.24-0.78 ml/kg/min, 
than in the sedentary control individuals, who declined at rates of 
0.30-1.62 ml/kg/min annually. The active subgroups in most of these studies 
participated in intense exercise, again leaving unsettled the impact of less 
strenuous activity. 


Modifiers: Depression 


Depression has powerful adverse effects on physical health and functional 
status (87). Inactivity is associated with negative affect and depressive symp- 
toms, but the direction of the relationship has been uncertain, because activity 
reduction is a well-recognized manifestation of depression. Two recent pop- 
ulation-based cohort studies provide stronger evidence that inactivity can 
precede depressive symptoms (18, 32). Farmer et al (32) used data 
from the National Health and Nutrition Examination Survey to examine the 
association between reported physical activity in 1975 and depressive symp- 
toms on resurvey in 1982-1984. Women who reported little or no recreational 
activity at baseline had significantly more depressive symptoms than more 
active women. Among members of the Alameda Population Laboratory 
cohort (18), those with low leisure activity levels at baseline were significant- 
ly more likely to report depressive symptoms nine years later, and those 
whose exercise levels declined over the nine-year interval had more de- 
pressive symptoms a decade later. 


Coronary Heart Disease 


Physical activity may affect the risk of coronary heart disease (CHD) through 
its direct effects on cardiac function (e.g. by increasing cardiac output, stroke 
volume) and through its indirect effects on levels of coronary risk factors (e.g. 
lipids, blood pressure, glucose metabolism, body composition) (52). Several 
recent reviews evaluated the strength of the epidemiologic evidence that 
relates physical activity levels to the occurrence of coronary outcomes mostly 
in middle-aged men (8, 62, 66). Berlin (9) found overall relative risks for the 
association of moderate or sedentary compared with high levels of nonoccu- 
pational physical activity to be 1.5 for CHD incidence (95% confidence 
interval, 1.4-1.7) and 1.6 for CHD death (95% confidence interval, 1.1-2.4). 
These studies provide very little information on the effects of physical activity 
on CHD risk in older men and virtually no information on older women. 

One exception is the Honolulu Heart Program cohort, which included men 
aged 65 and older (29). Physical activity was measured with an index created 
by multiplying the estimated oxygen consumption associated with basal, 
sedentary, slight, moderate, or heavy activities by the number of hours spent 
doing each activity during a usual day. Relative risks for the incidence of 
CHD during 12 years of follow-up, which compared men in the lowest tertile 
(inactive) with those in the highest tertile of physical activity (active), were 
1.5 for men aged 45-64 (95% confidence interval, 1.1-1.9) and 2.3 for men 
aged 65 and older (95% confidence interval, 1.0-5.3). In older, retired men, 
for whom the physical activity index would reflect leisure-time activity 
exclusively, the relative risk among inactive men was 3.4. 
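
The construction of such an activity index can be sketched in a few lines. The weights below are illustrative assumptions (relative oxygen consumption per hour at each activity level) and are not given in the text; the two usual days are hypothetical. The sketch only shows the arithmetic of weighting hours by estimated oxygen consumption.

```python
# Illustrative weights: relative oxygen consumption per hour at each
# activity level. These values are assumptions for the sketch and are
# not taken from the text above.
ASSUMED_WEIGHTS = {
    "basal": 1.0,
    "sedentary": 1.1,
    "slight": 1.5,
    "moderate": 2.4,
    "heavy": 5.0,
}


def physical_activity_index(hours_per_day, weights=ASSUMED_WEIGHTS):
    """Sum over activity levels of (hours per usual day) x (weight)."""
    total_hours = sum(hours_per_day.values())
    if abs(total_hours - 24.0) > 1e-9:
        raise ValueError("hours must account for a full 24-hour day")
    return sum(weights[level] * hours for level, hours in hours_per_day.items())


# Two hypothetical respondents describing a usual day.
inactive_day = {"basal": 10, "sedentary": 10, "slight": 4, "moderate": 0, "heavy": 0}
active_day = {"basal": 8, "sedentary": 6, "slight": 6, "moderate": 3, "heavy": 1}

inactive_index = physical_activity_index(inactive_day)
active_index = physical_activity_index(active_day)
print(f"inactive: {inactive_index:.1f}, active: {active_index:.1f}")
```

Because every hour of the day receives a weight, an index of this kind captures occupational as well as leisure-time activity, which is why its meaning shifts toward leisure activity in retired men.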

The effects of exercise on CHD may be explained, at least in part, by its 
impacts on lipids. Epidemiologic studies have consistently shown that physi- 
cal activity improves lipid profiles in older adults (67, 73, 91). In general, 
those who exercise regularly at moderate intensity levels or higher have 
higher high density lipoprotein (HDL) cholesterol levels and lower low 
density lipoprotein cholesterol and triglyceride levels. One study (67) found 
that in women, the highest HDL cholesterol levels were seen in light to 
moderate intensity exercisers. 


Other Chronic Diseases 


In another analysis of initially healthy Honolulu men from the Heart Program 
cohort, the physical activity index was associated with remaining free of more 
than eight major chronic conditions during a 12-year follow-up period after 
adjustment for age (7). Additional adjustment for body mass index, blood 
pressure, and other risk factors diminished the association between physical 
activity index and remaining healthy, which suggests that the association 
between physical activity and major chronic diseases may be mediated by 
favorable levels of other risk factors. 

Disability 

The investigation of the effect of activity on functional outcomes is relatively 
new; initial studies have been conducted within the context of ongoing 
prospective studies of older adults. Four studies have directly examined the 
association between baseline physical activity and subsequent function. Three 
have shown compelling associations between physical activity and maintain- 
ing function (13, 49, 59), whereas the fourth found no effect of baseline 
activity status (64). 

Branch (13) found that men and women aged 65 and older, who reported 
having slowed down their physical activities, were twice as likely to have 
functional disabilities five years later. Mor et al (59) used data from the 
Longitudinal Supplement on Aging, a national probability sample of older 
adults, to investigate factors associated with functional decline in persons 
aged 70-74. Two indicators of physical activity were available: engaging in a 
regular routine of physical exercise and walking a mile or more without 
resting. Among men and women who were functionally intact at baseline, 
inactivity was associated with a 50% greater risk of losing function during a 
two-year follow-up period. A regular routine of physical exercise was the 
more predictive item for men, whereas the frequency of walking a mile or 
more was the more predictive item for women. 

In three communities of the National Institute on Aging Established Pop- 
ulations for Epidemiologic Studies of the Elderly, 6981 older men and women 
with intact mobility at baseline (i.e. able to climb stairs and walk a half mile 
without help) were followed annually for four years to determine factors 
associated with maintaining mobility (49). The frequency of three types of 
physical activity was examined: walking, gardening, and doing vigorous 
exercise. For each of the three activities, age-adjusted rates of maintaining 
mobility were highest in those who engaged in the activity three or more times 
per week and lowest in those who rarely or never engaged in the activity. 
None of the activities was clearly superior to the others in maximizing the 
rates of maintaining mobility. Based on a composite index of physical activity, 
which summed the frequency of all three activities, men and women with 
high activity levels were 40% less likely to lose mobility during the four years 
of follow-up than those with low activity levels. 

Conversely, analysis of the Framingham data showed no association be- 
tween an activity index based on hours spent in sedentary, slight, moderate, 
and heavy activity and a cumulative disability index assessed 21 years later 
(64). The lengthy interval between activity measurement and assessment of 
outcome raises the question as to whether the baseline measure of activity 
indexed a consistent pattern over the follow-up interval. Sustained activity 
appears to be necessary for health benefits (62). 


Mortality 


Several studies have examined the association between activity and mortality 
in older adults. Table 1 summarizes the results of eight recent prospective 
studies. Of the eight studies, four focused on simple dichotomies by compar- 
ing some regular physical activity with sedentary lifestyle (35, 43, 71) or 
slowed down activities with not slowed down (14). Inactivity was associated 
with a 30-40% increased risk of death in two of the studies (43, 71), and a 
threefold increased risk of death in the third (35). A fourth study of Swedish 
men born in 1913 found a twofold increased risk of death during 15 years of 
follow-up among those who rated their physical fitness level as bad or very 
bad on a seven-point scale, compared with men who reported excellent 
physical fitness levels (82). These studies provide no information as to the 
type, amount, or intensity of activity associated with the protective effect. 

In contrast, a study of Harvard alumni measured physical activity in terms 
of energy expenditure based on number of blocks walked, stairs climbed, and 
time spent in sports play (63). Men aged 60-84 who expended more than 
2000 kilocalories per week, compared with those who expended less than 
500, were only half as likely to die during 12-16 years of follow-up. 
Similarly, physical fitness, as measured by total treadmill time, was strongly 
associated with mortality during a 4-15-year follow-up in initially healthy 
men and women aged 60 and older (9). In this study, the association between 
physical fitness and mortality was stronger in older adults (>60 years) than in 
the younger age groups (20-59 years). Social activity, as measured by such 
leisure time activities as going to movies, dancing, sightseeing, and picnick- 
ing, was not associated with risk of death in an eight-year study recently 
reported (54). 

This evidence suggests that regular physical activity reduces the risk of 
death in persons aged 60 and older. Paffenbarger & Hyde (62) emphasize that 
physical activity must be a current practice to be beneficial; exercise in 
mid-life that is discontinued in late life is of no benefit, whereas exercise 
initiated in late-life, even after a sedentary middle-age, may result in substantial 
gains in life expectancy. Too few studies have focused on the relationship 
between social activity and mortality in older adults to draw firm 
conclusions. 


CONCLUSIONS 


Older adults reduce their activity levels as they age, and larger proportions are 
sedentary. Planning feasible interventions requires a far better understanding 
of the determinants of this behavior pattern than currently exists. We must 
determine the extent to which well-meaning advice by formal and informal 
caregivers leads to activity reductions following falls, near-falls, other acci- 
dents, or illnesses. 

Whatever the reason for its occurrence, strong and consistent evidence 
indicates that chronic inactivity has important adverse health consequences. 
The studies also provide encouraging evidence that even modestly increased 
physical activity levels in older adults may have major public health benefits. 
Increased activity in older adults appears to result in diminished age-related 
declines in physiologic reserve, fewer depressive symptoms, reduced risk of 
CHD, fewer osteoporotic fractures, higher rates of maintaining function and 
avoiding functional loss, and lower mortality. 

The method of measuring physical activity varies widely across studies, as 
does the level of activity at which health benefits begin. Future research must 
answer crucial questions about the type, intensity, and duration of activity 
required to achieve various health effects. Whether increasing physical activity 
levels in previously sedentary older adults will achieve the same health 
benefits as naturally selected activity patterns is a crucial and testable 
hypothesis for experimental studies. Existing studies are discussed in the 
companion article (15). 


INTRODUCTION 


Older adults often say “staying active” is important to healthy aging. Physical 
activity is usually emphasized, but intellectual and social activities are also 
important. A growing body of scientific evidence addresses this subject. In 
the preceding article, we critically reviewed the epidemiologic evidence that 
activity patterns are associated with health (96). Here, we discuss the experimental 
evidence that interventions that “activate” older adults promote 
their health and maintain or improve physical and mental functioning during 
normal daily activities. 

The distinction between physiologic and functional status effects of ex- 
ercise is critical to this review. The International Classification of Im- 
pairments, Disabilities, and Handicaps definitions of impairments and dis- 
abilities embody this distinction. An impairment is a “loss or abnormality of 
psychological, physiological, or anatomical structure or function” (26). A 
disability is a “restriction or lack . . . of ability to perform an activity in the 
manner or within the range considered normal for a human being” (26). A 
great deal is known about the therapeutic role of exercise on physiologic 
impairments in such diseases as ischemic heart disease, hypertension, nonin- 
sulin dependent diabetes mellitus, chronic obstructive pulmonary disease, 
obesity, and depression. We do not discuss these areas in any detail, as they 
are covered by other review articles and in Bouchard et al’s (18) recent 
textbook. Much less is known about exercise and “functional status.” 

We begin by considering the physiologic measures of physical fitness, 
evidence that exercise can improve fitness in older adults, and theoretical 
reasons why exercise should improve functional status in older adults. We 
review experimental evidence that addresses whether exercise improves func- 
tional status in older adults and focus on balance, gait, and physical health 
status; cognitive status; and rate of bone loss. 


EXERCISE AND PHYSICAL FITNESS 


There are five common measures of physical fitness (86): muscle strength and 
endurance, flexibility, body composition, anaerobic capacity (abilities), and 
aerobic capacity (abilities). Exercise affects these measures of fitness in 
healthy, younger adults. Because age-related decline in strength and aerobic 
capacity is hypothesized to be important to disability, we focus on these 
measures. Joint flexibility, body composition, and anaerobic capacity also 
show age-related changes, but have received little attention as to their role in 
the pathogenesis of frail health. 


Aerobic Capacity 


DEFINITION AND MEASUREMENT Aerobic capacity can be defined as the 
ability of the body to produce energy by using oxygen. It is a principal 
measure of the ability of the body to do sustained work (86). It is usually 
assessed as maximal aerobic power, or VO2 max, and measured as (maximal) 
milliliters of oxygen consumed per kilogram of body weight per minute 
(ml/kg/min), or as metabolic equivalents (METS) (1 MET = the rate of 
oxygen consumption at rest, about 3.5 ml/kg/min) (56). 


AGE-RELATED DECLINE Much attention has been paid in gerontology to the 
decline in aerobic capacity with age (13, 20, 37, 57, 83). Between the ages of 
30 and 80, about 50% of aerobic capacity is lost. Even so, variation in aerobic 
capacity is large enough that the range in older adults overlaps that of younger 
adults (55, 62, 72). Absolute rates of decline are higher in sedentary adults 
than active adults (23, 37, 57). A recent study estimated that exercising adults 
lose 0.25 ml/kg/min in aerobic capacity each year, which is one-third the 
yearly loss rate of 0.71 ml/kg/min for nonexercisers (57). 


EFFECTS OF EXERCISE Many studies report that aerobic exercise improves 
aerobic capacity in older adults (1, 4, 9, 10, 14, 17, 21, 33, 35, 39, 41, 48, 
53, 61, 69, 70, 73, 79, 81, 84, 91, 93). Improvement with 3-12 months of 
exercise is modest and ranges from 5% to 20%. 


Skeletal Muscle Strength 


DEFINITION AND MEASUREMENT Strength can be defined as the maximum 
force exerted by a muscle (38). Strength is not just a property of muscle, but 
depends upon neurological function (44). Strength can be measured as the 
maximum weight lifted (isotonic strength), as the maximum force exerted 
against a fixed object (isometric strength), or as the peak torque produced at a 
given speed of muscular contraction (isokinetic strength). 


AGE-RELATED DECLINE Age-related decline in strength is well docu- 
mented. Typical cross-sectional data suggest a 30-40% loss of back, leg, and 
arm strength between ages 30 and 80 (50). Longitudinal studies suggest that 
the rate of decline is curvilinear and underestimated by cross-sectional data 
(32). For example, longitudinal studies show a 60% loss in grip strength 
between ages 30 and 80 (32). Other longitudinal studies report that healthy 
older adults lose 10-25% of their quadriceps strength in seven years (6, 8). 


EFFECTS OF EXERCISE There are fewer studies of the effects of strengthen- 
ing exercise in older adults than of aerobic exercise (2, 4, 5, 7, 27, 28, 33, 45, 
46, 49, 58, 60, 67, 70, 71). Most studies are small, nonrandomized trials of a 
few months of resistance training, although a randomized controlled trial was 
recently reported (28). Almost all studies report that resistance training 
increases the strength of older adults (46, 91). Earlier studies of low and 
moderate intensity resistance training reported modest increases (10-25%) in 
strength with exercise (7, 60, 67, 71). More recent studies in healthy adults 
(28, 49) and frail adults (46, 47) showed that more vigorous exercise produces 
far greater gains in strength (100-200% in a three-month training program). 
Because elderly adults are often relatively weak, expressing improvement in 
terms of percent gain obscures the fact that absolute gains in strength are 
modest. 

Although resistance exercise can increase strength in older adults, the 
mechanism underlying the gain is debated. An early article argued that 
although older adults increased strength with exercise, their muscles did not 
hypertrophy (67); rather, neural factors (learning effects) could account for 
the increased strength. Later studies showed that resistance exercise can cause 
muscular hypertrophy (46, 49), yet neural effects may still be important. The 
cross-sectional area of a muscle and its strength are highly correlated in 
younger adults (63) and probably in older adults, as well (75). However, 
short-term training programs typically increase strength far more than muscle 
cross-sectional area (63). This finding also appears true in older adults, as 
studies report that 10-20% increases in muscle area are accompanied by 
100-200% increases in strength (46, 49). Thus, the excess strength, not 
accounted for by hypertrophy, may be due to various neural factors (68). 
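
The size of this unexplained component can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that strength is directly proportional to muscle cross-sectional area; the function name is hypothetical, and the gains are the ranges quoted above.

```python
def unexplained_strength_factor(area_gain, strength_gain):
    """Multiplicative strength gain left over after scaling by muscle area.

    Gains are fractions (0.20 means +20%). Assumes, for illustration only,
    that strength is directly proportional to cross-sectional area.
    """
    return (1.0 + strength_gain) / (1.0 + area_gain)


# Most conservative case: largest area gain with smallest strength gain.
low = unexplained_strength_factor(area_gain=0.20, strength_gain=1.00)
# Most extreme case: smallest area gain with largest strength gain.
high = unexplained_strength_factor(area_gain=0.10, strength_gain=2.00)
print(f"strength per unit area rises by a factor of {low:.2f} to {high:.2f}")
```

Under this assumption, even the most conservative pairing leaves strength per unit of muscle area substantially higher after training, which is the gap that neural factors are invoked to explain.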


THEORETICAL RELATIONSHIP BETWEEN FITNESS 
AND FUNCTIONAL STATUS 


The pathogenesis of disability is complex. Part I of this series describes a 
model of disability. We describe below in more detail the theoretical relation- 
ship between fitness and functional status and the rationale for why exercise 
should improve functional status. 


Physical Functional Status 


The usual mechanism invoked to explain why exercise should improve 
functional status focuses on aerobic capacity (20, 82, 97). The energy needed 
to do a given task can be estimated by measuring oxygen consumption during 
steady-state performance of the task. For example, walking on a level grade at 
5 km/h requires 3.2 METS (56). Loss of aerobic capacity with inactivity or 
illness eventually causes aerobic capacity to fall below the level required for 
daily tasks. Because exercise can increase aerobic capacity, it should improve 
functional status when aerobic capacity is below the threshold needed for 
daily activities. 
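
A minimal numerical sketch of this threshold argument follows. The conversion factor (1 MET = 3.5 ml/kg/min), the cost of level walking (3.2 METS), and the annual decline rates (0.25 and 0.71 ml/kg/min) come from the text; the starting aerobic capacity of 20 ml/kg/min, the linear decline, and the function names are assumptions. The sketch also treats the task cost itself as the threshold, which probably understates the capacity needed for comfortable sustained walking.

```python
ML_PER_KG_MIN_PER_MET = 3.5  # resting oxygen consumption, as stated in the text


def mets_to_vo2(mets):
    """Convert metabolic equivalents to ml of oxygen per kg per minute."""
    return mets * ML_PER_KG_MIN_PER_MET


def years_until_below(current_vo2max, task_mets, annual_decline):
    """Years until VO2max reaches the task cost, assuming linear decline."""
    margin = current_vo2max - mets_to_vo2(task_mets)
    return max(margin, 0.0) / annual_decline


walking_cost = mets_to_vo2(3.2)  # level walking at 5 km/h
hypothetical_vo2max = 20.0       # ml/kg/min; an assumed starting value

years_active = years_until_below(hypothetical_vo2max, 3.2, annual_decline=0.25)
years_sedentary = years_until_below(hypothetical_vo2max, 3.2, annual_decline=0.71)
print(f"walking cost: {walking_cost:.1f} ml/kg/min")
print(f"active: {years_active:.1f} y, sedentary: {years_sedentary:.1f} y")
```

With these assumed inputs, the reserve above the walking threshold lasts roughly three times longer at the exercisers' rate of decline than at the nonexercisers' rate.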

There is a roughly parallel explanation focusing on strength (22, 97). The 
amount of strength needed to perform a task can be estimated from biome- 
chanical studies. To stand up, for example, the typical person requires about 
120 Newton-meters of knee torque to transfer his body weight from a chair to 
his lower extremities (59). Suppose strength falls below the level required for 
standing up. Increasing maximal strength should improve function, because 
of an elegant logarithmic mathematical relationship between peak strength 
and endurance at submaximal tasks (38, 85). Suppose maximal strength for 
about one second is 20 kg. If, after an exercise program, maximum strength 
improves to 40 kg, the person should be able to lift the 20 kg weight for 60 
seconds. That is, increasing strength should improve performance on 
submaximal tasks of daily life. 
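
The worked example can be reproduced with a simple log-linear model. The exact form of the relationship in the cited sources is not given here, so the sketch assumes that the logarithm of endurance time falls linearly with the fraction of maximal strength used, and it calibrates the single constant to the two points in the example (about 1 second at 100% of maximum, 60 seconds at 50%). The function name and the constant are illustrative.

```python
import math

# Slope of log10(endurance time) per unit of unused strength fraction,
# calibrated so that a load at 50% of maximum can be held for 60 s.
# This calibration is an assumption drawn from the example in the text.
K = math.log10(60.0) / 0.5


def endurance_seconds(load_kg, max_strength_kg):
    """Estimated time a load can be held, under the assumed log-linear model."""
    fraction = load_kg / max_strength_kg
    if fraction > 1.0:
        return 0.0  # load exceeds maximal strength and cannot be lifted
    return 10.0 ** (K * (1.0 - fraction))


before = endurance_seconds(20.0, max_strength_kg=20.0)   # at 100% of maximum
partway = endurance_seconds(20.0, max_strength_kg=30.0)  # at about 67% of maximum
after = endurance_seconds(20.0, max_strength_kg=40.0)    # at 50% of maximum
print(f"before: {before:.0f} s, partway: {partway:.0f} s, after: {after:.0f} s")
```

Under this assumed model, endurance grows much faster than strength: a 50% gain in maximal strength already extends the holding time for the same load from about 1 second to roughly 15 seconds.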

Note that both explanations involve a threshold effect. Levels of fitness 
below the threshold are associated with impaired ability to do the activity. If 
fitness exceeds the threshold, the activity can be performed. Figure 1 illus- 
trates this relationship. 
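
The practical consequence of a threshold can be sketched with a toy model. The function below is a piecewise-linear simplification of the curvilinear relationship, with arbitrary units and a hypothetical threshold of 50; none of the numbers come from the studies reviewed. It applies the same absolute fitness gain to three hypothetical groups.

```python
def functional_status(fitness, threshold=50.0):
    """Hypothetical score in [0, 1]: rises with fitness, plateaus at threshold."""
    return min(fitness / threshold, 1.0)


def functional_gain(baseline_fitness, fitness_gain=15.0):
    """Change in functional status after the same absolute gain in fitness."""
    return (functional_status(baseline_fitness + fitness_gain)
            - functional_status(baseline_fitness))


# Baseline fitness values are assumptions chosen to sit below, near,
# and above the threshold.
gain_frail = functional_gain(20.0)
gain_near_frail = functional_gain(40.0)
gain_healthy = functional_gain(70.0)
print(f"frail: {gain_frail:.2f}, near-frail: {gain_near_frail:.2f}, "
      f"healthy: {gain_healthy:.2f}")
```

In this toy model, the identical exercise effect yields the largest functional gain in the frail group, a smaller gain in the group that crosses the threshold, and none in the group already above it.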


Cognitive Functional Status 


The argument that exercise improves cognitive status depends mainly 
upon epidemiologic evidence that older adults who exercise perform better on 
neuropsychological tests. Our preceding article comments on this evidence. 
The physiologic mechanism underlying the association is unclear. Exercise 
could improve cognition by improving either blood flow to the brain (78) or 
oxygen metabolism in the brain (41). 


INTERVENTION STUDIES 


Exercise Effects on Gait, Balance, and Physical Function 


Several intervention studies address the effect of exercise on gait and balance 
in older adults (Table 1). Seven studies of gait and balance reported some 
statistically significant improvements attributable to exercise (11, 34, 45, 46, 
54, 76, 95). Another positive study is not included in Table 1 because it did 
not report detailed data (77). Three studies reported no significant effects of 
exercise on gait and balance (19, 31, 52). Another study, not included in 
Table 1 because it did not report detailed data, was also negative (12). 


[Figure 1 plot: functional status measure (e.g. gait speed) on the vertical axis, from poor to normal, with a threshold marked; measures of physical fitness (e.g. strength) on the horizontal axis; Study 1 (frail adults), Study 2 (near-frail adults), and Study 3 (healthy adults) marked along the curve.] 

Figure 1 Theoretical relationship between physical fitness and functional status. The curvilinear 
relationship shows a threshold effect: above the threshold level of fitness, functional status is 
normal; below it, function is impaired. A curvilinear relationship implies that the benefit from 
exercise depends upon the target group. Three hypothetical exercise studies are shown. Each 
study produces the same absolute improvement in fitness. In the frail adults of Study 1, exercise 
produces a large improvement in functional status. In the healthy adults of Study 3, no benefit is 
seen. Study 2 shows intermediate benefits. 

Several factors may account for the variation in findings among the studies 
in Table 1. If there is a curvilinear relationship between fitness and functional 
status (Figure 1), exercise should not improve gait and balance in adults 
whose fitness levels exceed a certain threshold. Variation in fitness levels 
among study samples could explain differences in results. Possibly, the 
negative studies had an inadequate exercise stimulus. A counter argument is 
that exercise may not need to improve fitness to improve function. 

Limited statistical power may explain differences in results and is reason to 
regard the results in Table 1 as encouraging. In studies in which effect sizes 
could be calculated, exercise produced an effect size of 0.15 or greater for 
76% (13/17) of the outcomes studied. 
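
The power problem can be made concrete with a rough sample-size calculation. The sketch below assumes that "effect size" means a standardized mean difference between two equal-sized groups, which the text does not specify, and uses the usual normal approximation for a two-sided test; the function name and default values are illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group needed to detect a standardized
    mean difference with a two-sided, two-sample test (normal approximation)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2)


share_of_outcomes = 13 / 17          # proportion quoted in the text
needed_small = n_per_group(0.15)     # effect size at the cutoff used above
needed_moderate = n_per_group(0.50)  # a moderate effect, for comparison
print(f"{share_of_outcomes:.0%} of outcomes; "
      f"n per group: {needed_small} (d=0.15), {needed_moderate} (d=0.50)")
```

Under these assumptions, an effect near 0.15 requires several hundred subjects per group for 80% power, far more than a small trial can enroll, so nonsignificant results from small trials cannot rule out effects of this size.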

The studies in Table 1 provide only preliminary evidence that exercise 
improves gait and balance in older adults. The lack of randomized trials is 
important: Subjects may show learning effects on clinical gait and balance 
measures (51). Most studies lacked blinded outcome measures and had small, 
nonrepresentative study groups. As studies lacked follow-up, it is uncertain 
whether exercise can produce sustained effects on gait and balance. Ongoing 
research, such as the Frailty and Injuries: Cooperative Studies of Intervention 
Techniques (FICSIT) initiative funded by the National Institute on Aging, 
should help clarify the situation. For example, the University of Washington 
FICSIT study, along with additional research funded by the Centers for 
Disease Control, compares the effects of six different types of exercise in a 
single-blinded, randomized controlled trial involving over 180 subjects. 

We did not find exercise studies restricted to older adults that carefully 
measured functional status outcomes other than gait and balance. Exercise 
studies in arthritis patients have measured functional status, although most 
have enrolled both young and old adults. Because of the pain and joint 
limitations of arthritis, the public health importance of exercise is enhanced if 
exercise is safe and effective in arthritis patients. 

Three studies of exercise in rheumatoid arthritis and osteoarthritis patients 
included follow-up outcome measures (Table 2). All reported improvement in 
functional status because of exercise. Improvements in function persisted at 
follow-up three months (42), eight months (47), and nine months (64) after 
discharge from supervised exercise classes. Studies typically reported 10-25% 
improvements in outcomes at follow-up compared with baseline. Notably, 
exercise did not make arthritic pain worse. Two studies reported that 
exercise relieved pain symptoms. 

The studies in Table 2 provide relatively strong evidence that exercise 
improves functional status in arthritis patients. Studies had long-term 
follow-up, and one study was single-blinded (47). Study samples were somewhat 
different in age and arthritis symptoms, which suggests that results may be 
generalizable to broad population groups. Arthritis patients are not physically 
fit and have reduced aerobic capacity (65) and muscular strength (47, 54). By 
limiting the sample to arthritis patients, studies focused on a target group 
capable of showing considerable improvement in functional status in a short 
period. 


Exercise Effects on Cognition 


Several experimental studies address the obvious chicken-and-egg issue: Does 
exercise make people smarter, or do smarter people simply exercise more? 
The first five studies in Table 3 (11, 40, 41, 74, 92) reported some significant 
improvement. The next five studies (17, 43, 61, 66, 70) were more rigorous 
and reported no improvement. There are several possible explanations for the 
mixed results. As noted above, it may be important to target not just sedentary 
adults, but specifically physically unfit adults. The exercise protocols varied 
in type, duration, and intensity. Samples varied from community adults to 
institutionalized mental patients. 

A common interpretation of the negative studies is that short-term exercise 
is not sufficient (61, 70). Adults must exercise for long periods of time to 
show cognitive benefits. This interpretation is supported by the last study in 
Table 3 of a three-year exercise program (76). The study was not a random- 
ized trial, but the investigators reported a modest, though significant, effect of 
exercise on reaction time. 

Another interpretation is that existing studies lack statistical power and 
cannot reliably detect an effect of exercise on neuropsychological tests. 
Statistically significant improvement in test performance because of exercise 
ranged in magnitude from 3% to 35% (next to last column, Table 3). But 
studies reported nonsignificant trends of the same magnitude (last column, 
Table 3). No one has argued that exercise should have large effects on 
cognitive function as measured by neuropsychological tests. It seems more 
likely that exercise effects are modest. 

In summary, the existing literature is insufficient to prove or exclude the 
possibility of a modest effect of exercise on cognition. An upward shift of the 
population mean IQ by a few points would represent an important, if not 
remarkable, effect of exercise. Large, long-term, well-designed, well- 
targeted exercise studies are needed that have the statistical power to detect 
modest effects of exercise on cognition. 


Exercise Effects on Osteoporosis 


The increased risk of osteoporosis with inactivity, particularly bed rest, has 
long been recognized. Later, exercise per se was identified as a protective 
factor that decreases fracture risk (25). Many reviews of this subject exist, 
including a recent review of both animal and human evidence (80) and a 
critical review by Block (16). 

Given the large amount of interest in this subject, there are surprisingly few 
randomized trials. Table 4 shows nine studies in postmenopausal women that 
had at least one year of follow-up, only two of which are randomized 
controlled trials (15, 29). Six studies (29, 30, 36, 87-90) reported a positive 
effect of exercise on bone mass. Two studies (3, 24) were reported as 
negative, but had small sample sizes and little statistical power. The only 
large randomized trial was reported as negative (15); its walking intervention 
was of modest intensity, which may partly explain this result. 

An interesting aspect of the studies is that bone mineral density was often 
measured at the radius. Changes in radial bone density would be interpreted as 
generalized effects of exercise, as exercise programs did not usually focus on 
the wrist. One study reported that strength training with aerobic exercise may 
increase bone density more than just aerobic exercise alone (29). 

It is widely believed that exercise promotes bone strength in postmenopaus- 
al women. Part I of this series shows that research findings from observational 
studies support this conclusion, yet there is a lack of evidence from random- 
ized trials. Randomized trials are needed particularly to rule out the possibility 
that past exercise is responsible for the apparent benefit of exercise in later 
years. Investigators must systematically study how type and intensity of 
exercise affect bone strength. Eventually, studies will need to test whether 
exercise reduces fracture rates. 


SUMMARY 


This review has focused on a specific part of the relationship of exercise to 
health. The overall evidence supporting the health benefits of exercise is 
substantial and has been critically reviewed recently (18, 94). Thus, the 
United States Preventive Services Task Force recommends that all adults 
exercise regularly (94). The conclusions summarized below regarding older 
adults do not affect this basic recommendation. 

There is solid evidence that exercise can improve measures of fitness in 
older adults, particularly strength and aerobic capacity. These exercise effects 
occur in chronically ill adults, as well as in healthy adults. Because physical 
fitness is a determinant of functional status, it is logical to ask whether 
exercise can prevent or improve impairments in functional status in older 
adults. 

The evidence that exercise improves functional status is promising, but 
inconclusive. Problems with existing studies include a lack of randomized 
controlled trials, a lack of evidence that effects of exercise can be sustained 
over long periods of time, inadequate statistical power, and failure to target 
physically unfit individuals. 

Existing studies suggest that exercise may produce improvements in gait 
and balance. Arthritis patients may experience long-term functional status 
benefits from exercise, including improved mobility and decreased pain 
symptoms. Nonrandomized trials suggest exercise promotes bone mineral 
density and thereby decreases fracture risk. Recent studies have generally 
concluded that short-term exercise does not improve cognitive function. Yet 
the limited statistical power of these studies does not preclude what may be a 
modest, but functionally meaningful, effect of exercise on cognition. 

Future research, beyond correcting methodologic deficiencies in existing 
studies, should systematically study how functional status effects of exercise 
vary with the type, intensity, and duration of exercise. It should address issues 
in recruiting functionally impaired older adults into exercise studies, issues in 
promoting long-term adherence to exercise, and whether the currently low 
rate of exercise-related injuries in supervised classes can be sustained in more 
cost-effective interventions that require less supervision. 


INTRODUCTION 


Among persons aged 65 years or older, falls are the leading cause of death 
from injury (66, 79). Major morbidity from falls includes more than 230,000 
hip fractures per year among persons in this age group (National Center for 
Health Statistics 1987, unpublished data). The cost of falls among older 
persons is enormous, because of the high death toll, numerous disabling 
conditions, and extensive hospital stays; nearly $10 billion of the $158 billion 
lifetime economic cost of injury to our nation can be attributed to falls among 
older persons (79). Moreover, falls pose a particular problem for public health 
professionals in the development of both surveillance systems and prevention 
strategies. 

To understand the concepts of fall prevention, one must also understand the 
concepts of injury control. In this article, I discuss from a public health 
perspective the concept of injury as a disease, the extent of the problem of 
falls among older persons, current concepts on the etiology of falls, the need 
for better surveillance, and how understanding these needs and concepts could 
lead us to develop a systematic approach to fall prevention. 


¹The US Government has the right to retain a nonexclusive royalty-free license in and to any 
copyright covering this paper. 


INJURY AS A DISEASE 


By most measures, injury ranks as one of the most serious public health 
problems in the United States today (15). Although the human and financial 
costs of injury in our society are very high, support for injury control has 
lagged far behind support for other public health problems (14, 79). Injuries 
occur in such great numbers that, until recently, they have been tacitly 
accepted as a normal part of living in a modern society. Fortunately, in 
the 1985 report, Injury in America, the Committee on Trauma Research of the 
National Research Council and the Institute of Medicine proposed a national 
plan for injury control that focused on a public health approach to reducing 
injuries (14). Committee members understood that, like other diseases, injur- 
ies could be viewed as a problem in medical ecology—that is, as a relation- 
ship between a person (the host), an agent, and the environment. Unlike these 
other diseases, however, the underlying agent of injury is not a microbe or 
carcinogen. Instead, the agent is energy, most often in the form of mechanical 
force (37). 

Injury should be considered a disease that has a short latency period. It 
results from the acute, rapid exposure to energy (mechanical, thermal, chemi- 
cal, electrical, or radiation) or from the absence of specific body needs, such 
as oxygen or heat (5). The dose of energy received, the dose’s distribution, 
duration, and rapidity, and the human’s response to the transfer of the energy 
can determine whether an injury occurs or is prevented (14). For example, a 
large mechanical energy load quickly transmitted during a fall involving an 
older person may damage cells, tissues, and other structures, thus resulting in 
a fracture. If the same energy load could be transmitted at a slower velocity or 
dissipated over a much larger area, different responses could be mobilized, 
thus resulting in the prevention of injury during the fall. 
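
The effect of spreading a fixed energy load over a larger area can be illustrated with a short calculation. This is a minimal sketch, not a biomechanical model: the body mass, drop height, and contact areas are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
# Illustrative only: mass, height, and contact areas are hypothetical values.
G = 9.81  # gravitational acceleration, m/s^2


def fall_energy_joules(mass_kg: float, drop_height_m: float) -> float:
    """Potential energy released when a mass drops a given height (E = m*g*h)."""
    return mass_kg * G * drop_height_m


def energy_per_area(energy_j: float, contact_area_m2: float) -> float:
    """Energy delivered per unit of contact area, in J/m^2."""
    if contact_area_m2 <= 0:
        raise ValueError("contact area must be positive")
    return energy_j / contact_area_m2


energy = fall_energy_joules(mass_kg=60.0, drop_height_m=0.7)
concentrated = energy_per_area(energy, contact_area_m2=0.005)  # small contact area
dissipated = energy_per_area(energy, contact_area_m2=0.05)     # same energy, larger area

print(f"energy available: {energy:.0f} J")
print(f"concentration ratio: {concentrated / dissipated:.0f}x")
```

The same energy load spread over ten times the area delivers one tenth of the energy per unit area, which is the principle behind dissipating the load to prevent injury.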


EPIDEMIOLOGY OF FALLS 


Definitions and Classification Schemes 


The Kellogg International Work Group, from whose work most fall defini- 
tions are derived, defined a fall as “an event which results in a person coming 
to rest inadvertently on the ground or other lower level and other than as a 
consequence of the following: sustaining a violent blow; loss of con- 
sciousness; sudden onset of paralysis, as in a stroke; or an epileptic seizure” 
(45). Unfortunately, that and most of the derived definitions of a fall are 
clinically or research oriented; require extensive interviewing; are unwieldy to 
use in a public health setting; are subjective and, thus, allow differences in 
interpretation for each study setting; and are likely to miss a substantial 
number of falls, if the data are acquired through record review and abstracting. 

Developing a definition useful to public health officials is difficult, because 
a fall is not a disease. Rather, a fall is often a syndrome, which represents 
symptoms and signs of disordered function in a disordered environment. For 
example, a fall might be a direct result of underlying cardiovascular or 
musculoskeletal disease. Depending on the amount of energy transferred, a 
fall itself might lead to a disease (e.g. hip fracture, traumatic brain injury) or, 
more often, might never attract medical or public health attention. 

Various state and community public health programs have demonstrated 
that effective intervention strategies can be implemented by using available 
data on the external causes of injury, as defined by the International 
Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9- 
CM) (33, 64). ICD-9-CM contains a standard coding system that describes 
diseases and the anatomical nature of injuries (N-codes). A supplemental 
volume, titled External Causes of Injuries (E-codes), describes the circum- 
stances and location of the injury and is extremely useful for public health 
practitioners to quantify the problem of falls in their communities. 

Falls can be coded according to the external causes of the injury (codes 
E880-E888); however, a further definition is needed for those falls that result 
in nonfatal injury. The terms “fall injuries” and “fall-related injuries” are 
widely used, but are ambiguous, as they are often used to describe the type of 
anatomical injury (e.g. hip fracture, brain injury), multiple types of anatomi- 
cal injuries during the same fall, or multiple fall events with at least one 
injury. To reduce this ambiguity, I suggest using the terms “fall injury event,” 
which is the occurrence of a fall that resulted in at least one anatomical injury, 
and “fall injury,” which is the type of anatomical injury sustained during the 
fall (such as hip fracture, skull fracture, superficial injury) (89). For those fall 
injury events that result in more than one anatomical injury, some researchers 
have developed hierarchies of fall injuries, whereby the most severe injury 
receives the top priority status for reporting (29, 89). 
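
The coding and reporting steps described above can be sketched in a few lines. This is an illustration under stated assumptions: E-codes are assumed to be written as strings with an optional decimal subdivision (for example "E885.9"), and the severity ranking shown is a hypothetical ordering, not one of the published hierarchies (29, 89).

```python
# Sketch: flag fall E-codes (E880-E888) and report the top-priority injury.
# SEVERITY_RANK is a hypothetical ordering for illustration only.
from typing import Iterable, Optional


def is_fall_ecode(ecode: str) -> bool:
    """True when an ICD-9-CM E-code lies in the fall range E880-E888."""
    code = ecode.strip().upper()
    if not code.startswith("E"):
        return False
    major = code[1:].split(".")[0]  # drop any decimal subdivision
    return major.isdigit() and 880 <= int(major) <= 888


SEVERITY_RANK = {"skull fracture": 1, "hip fracture": 2, "superficial injury": 3}


def primary_fall_injury(injuries: Iterable[str]) -> Optional[str]:
    """Return the listed injury with the highest priority, or None if none is ranked."""
    ranked = [injury for injury in injuries if injury in SEVERITY_RANK]
    return min(ranked, key=SEVERITY_RANK.get) if ranked else None


print(is_fall_ecode("E885.9"))
print(primary_fall_injury(["superficial injury", "hip fracture"]))
```

A fall injury event with several anatomical injuries is then counted once, under the injury that ranks highest in whichever hierarchy the program adopts.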

The multifaceted, multifactorial nature of falls has prompted attempts to 
classify falls by etiology, that is, to link specific risk factors or biologic 
measurements to specific types of fall (6, 9, 41, 47, 62, 67, 68, 93, 105). 
These classifications, however, are based on interviews with case patients or 
abstracts of their medical records about the circumstances of falls. Therefore, 
they are subject to recall and interviewer bias and have led to a lack of 
consistency in the literature on the association of risk factors and falls (87). 
Some examples of these classification schemes include unexplained falls 
versus falls with a self-evident etiology (e.g. syncope, seizure, stroke); and 
falls due to host (intrinsic) factors versus falls due to environmental (extrinsic) 
factors. Although these schemes might be useful in a clinical setting, they are 
less useful in a public health setting, because data this detailed are not readily 
available. 


Incidence 


In 1988, 9060 fatal falls (codes, E880-E888) occurred among persons aged 
65 years or older (National Center for Health Statistics 1988, unpublished 
data). Nearly 60% of fatal falls occur in the home or in a residential institution 
(90). Although falls are the leading cause of death due to injury among older 
persons, this ranking mainly reflects deaths among those aged 85 or older 
(Table 1). More than one half of injury-related deaths among women aged 85 or older, 
and more than one third among men of that age, are due to falls. The rate of deaths due to falls rises 
rapidly with increasing age for all race-sex groups aged 75 or older (Figure 1). 
White men aged 85 or older have the highest death rates associated with falls. 
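
The proportions cited in this paragraph follow directly from the counts in Table 1. The short sketch below recomputes them; the counts are taken from Table 1, whereas the variable and function names are assumptions made for illustration.

```python
# Deaths due to injuries, persons aged 65 or older, United States, 1988 (Table 1).
FALL_DEATHS = {
    ("men", "65-84"): 2459, ("men", "85+"): 1410,
    ("women", "65-84"): 2444, ("women", "85+"): 2747,
}
ALL_INJURY_DEATHS = {
    ("men", "65-84"): 16153, ("men", "85+"): 3676,
    ("women", "65-84"): 10215, ("women", "85+"): 5009,
}


def fall_share(group):
    """Fraction of injury deaths in a sex-age group that are due to falls."""
    return FALL_DEATHS[group] / ALL_INJURY_DEATHS[group]


total_fall_deaths = sum(FALL_DEATHS.values())

for group in FALL_DEATHS:
    print(group, f"{fall_share(group):.1%}")
print("all fatal falls:", total_fall_deaths)
```

The four cells sum to the 9060 fatal falls reported in the text, and the share of injury deaths due to falls is far higher in the group aged 85 or older than in the group aged 65-84 for both sexes.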

Pulmonary embolism is strongly associated with deaths due to falls and 
occurs in nearly 13% of such deaths (60). This association increases with 
increasing age and is much greater among those persons with fractures of the 
lower limbs, including femoral neck. Among those deaths from falls with no 
fracture listed, the prevalence of pulmonary embolism is 2.4%. 

Falls can also lead to significant morbidity in older individuals. About 7% 
of persons over age 75 visit hospital emergency rooms for a fall injury event 
each year (29). Falls account for nearly 70% of all emergency room visits to 
treat injuries in this age group (29). In a recent study in South Miami Beach, 
Florida, investigators found that the rate of nonfatal fall injury events 
increased steadily by each five-year age group for those aged 65 or older, 
reaching a high of 138 per 1000 for men and 159 per 1000 for women aged 85 
years or more (Figure 2). Of those fall injury events identified through the 
acute care setting, more than 40% resulted in hospital admissions, with an 
average length of stay of 11.6 days overall (89). 


Table 1 Number of deaths due to injuries for persons aged 
65 years or older, by cause, sex, and age group, United 
States, 1988^a 

                        Sex and age (years) 
                        Men                Women 
Cause^b            65-84      85+      65-84      85+ 

Falls              2,459    1,410      2,444    2,747 
Motor vehicles     3,583      495      2,718      363 
Drowning             315       39        119       43 
Fires/Burns          707      168        566      164 
Poisonings           322       53        290      132 
Homicide             694       59        498       79 
Suicide            4,672      498      1,086      107 
Other              3,401      954      2,494    1,374 

Total             16,153    3,676     10,215    5,009 

^a Source: National Center for Health Statistics, Detailed Mortality 
Tapes. 
^b According to the ICD-9-CM. 


Figure 1 Rates of deaths due to falls, by age and race-sex groups, in the United States, 1988. 


Figure 2 Rates of fall injury events (rate per 1,000), by age group (years) and sex. Study to Assess Falls Among the Elderly, 
Miami Beach, Florida, 1985-1987. 


One specific injury type, hip fracture, increases exponentially by age in 
older persons, from 28 per 10,000 persons aged 65-74 to 251 per 10,000 
persons aged 85 or older (82). For persons aged 65 or older, the rate of hip 
fracture among white women is about twice that of white men, and white 
persons have about twice the rate of hip fracture as persons of all other races 
(4, 24, 82). More hip fractures occur in winter than summer, but this seasonal 
variation occurs regardless of latitude (42). Data from the National Hospital 
Discharge Survey, National Center for Health Statistics, indicate that in 1987, 
233,432 hip fractures occurred among persons aged 65 or older. Of these, 
13,138 (5.6%) resulted in death during hospitalization. If 90% of these hip 
fracture-related deaths were caused by falls, we could estimate that about 
11,824 deaths due to falls resulted from only one type of injury. Mortality 
data from 1987 reveal that 8602 persons aged 65 or older died because of a 
fall. These and other data suggest that we have a major problem of 
undercounting deaths caused by falls (28). 
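
The undercounting argument rests on simple arithmetic, which can be checked as follows. All figures come from the text, including the 90% share, which is the text's own assumption rather than a measured value; the variable names are illustrative.

```python
# Figures are taken from the text; the 90% fall share is the text's assumption.
hip_fractures_1987 = 233_432          # hip fractures, persons aged 65 or older
in_hospital_deaths = 13_138           # deaths during hospitalization
fall_share_assumed = 0.90             # assumed share of those deaths caused by falls
fall_deaths_in_mortality_data = 8_602 # deaths from falls recorded in 1987 mortality data

case_fatality = in_hospital_deaths / hip_fractures_1987
estimated_fall_deaths = in_hospital_deaths * fall_share_assumed
shortfall = estimated_fall_deaths - fall_deaths_in_mortality_data

print(f"in-hospital case fatality: {case_fatality:.1%}")
print(f"estimated fall deaths from hip fracture alone: {estimated_fall_deaths:.0f}")
print(f"excess over recorded fall deaths: {shortfall:.0f}")
```

Because the estimate for a single injury type already exceeds the number of fall deaths recorded for all injury types combined, the recorded total must be an undercount if the 90% assumption is even approximately right.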

Most falls among the elderly population result in minor or no physical 
injury; only a small percentage of falls cause severe injury, such as a fracture 
(67, 99). An estimated 25-35% of older persons fall each year (67, 98), and a 
higher annual incidence is reported among older persons who live in residen- 
tial institutions (87). An estimated 3-6% of falls result in a fracture for 
persons living in the community and in nursing homes; 1% or less results in 
hip fractures (35, 67, 87, 99). Public health practitioners who focus preven- 
tion programs on elderly health should also consider systematically monitor- 
ing community- or nursing home-dwelling older persons for fall injury events. 
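
A practitioner planning such monitoring might begin with a rough estimate of the expected burden. The sketch below applies the ranges quoted above to a hypothetical community; the population size is invented for illustration, and the calculation assumes one fall per faller, so the fracture counts are a floor rather than a forecast.

```python
def annual_fall_burden(older_population: int) -> dict:
    """Return (low, high) yearly ranges for fallers and for fall-related fractures.

    Uses the ranges quoted in the text: 25-35% of older persons fall each year,
    and 3-6% of falls result in a fracture.
    """
    fallers = (0.25 * older_population, 0.35 * older_population)
    # Assumption: one fall per faller, so these fracture counts are a floor.
    fractures = (0.03 * fallers[0], 0.06 * fallers[1])
    return {"fallers": fallers, "fractures": fractures}


burden = annual_fall_burden(10_000)  # hypothetical community of 10,000 older persons
for outcome, (low, high) in burden.items():
    print(f"{outcome}: {low:.0f} to {high:.0f} per year")
```

Persons who fall more than once in a year would raise these fracture counts, which is one reason systematic monitoring is preferable to estimates of this kind.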


Risk Factors Related to the Host 


Several host factors may alter the risk of falls and fall injury events. Listed 
below are some of the factors that might help us identify high-risk individuals 
and develop screening techniques or prevention efforts (Table 2). 


Table 2 Potential risk factors for falls among the elderly that may 
help public health practitioners target intervention programs 

Host                Agent                Environment 

Age and sex         Mechanical energy    Lighting 
Osteoporosis        Impact position      Stairs 
Chronic diseases    Impact location      Rugs and flooring 
Gait and balance                         Bathtubs 
Vision                                   Shelving 
Mental status                            Footwear 
Medication use                           Streets and walkways 
Alcohol use 


AGE AND SEX In most studies of both community and institutionalized 
populations, researchers find that the risk of falling and being injured 
increases with age in both sexes and is greater in females than males at most 
ages (9, 29, 73, 89). For example, older persons tend to have poorer 
responses to injury events than younger persons (5). With aging, physiologic 
changes occur in articular cartilage, bone, ligaments, and musculature (95). 
These changes can lead to osteoporosis, arthritis, decreased muscle strength 
and mass, decreased joint flexibility, decreased collagen elasticity and 
strength, and general discomfort and pain. Individuals with these changes 
might respond more slowly during difficult or emergency situations or devel- 
op early and excessive fatigue, which might lead to an injury (14). In 
addition, the most effective energy absorber in the human body, the active 
musculature, depends mostly on muscle strength, which decreases with age 
(95). During an injury event, therefore, these changes in the musculoskeletal 
system can lead to a decreased ability to withstand the effects of mechanical 
energy. Much of this variation in fall risk, however, may be due to the 
biologic and functional variability within age groups, rather than to simple 
age-dependent variations (74). If exercise and general muscle conditioning for 
older persons prove effective, public health practitioners could include these 
programs with others targeted to older persons. 

Women and men may have different outcomes from a fall for several 
reasons. For example, osteoporosis may play a substantial role in hip and 
other limb fractures for women (19, 50, 59, 61). On the other hand, women 
might fall differently than men and absorb mechanical energy at different 
parts of the body (hip) than men (head) (89). 


OSTEOPOROSIS Among older persons, osteoporosis decreases bone resis- 
tance to mechanical energy, which increases the risk of compression fractures 
from a given force (14) and predisposes to fractures of the hip, vertebrae, 
distal forearm, and pelvis, especially in older, white women (19, 50, 59, 61). 
Considerable controversy exists over the relative importance of osteoporosis 
in the etiology of hip fracture. Some investigators have argued that older 
persons with hip fractures are no more osteoporotic than noninjured persons 
of similar ages (17, 18). By using biomechanics research, Lotz & Hayes (53) 
have shown that only about one twentieth of the energy available during a 
typical fall from a standing position may be needed to break a hip. On the other 
hand, other researchers state that data on osteoporosis and hip fracture have 
been misinterpreted and that the measurement of osteoporosis through bone 
densitometry may be used to predict the propensity for fracture and, therefore, 
be useful as a screening tool (43, 85). They also note that the use of estrogen 
replacement therapy (ERT) in peri- and postmenopausal women retards the 
development of osteoporosis and reduces the risk of hip fractures in older 
women (1, 23, 46, 104). 

The use of densitometry as a basic screening tool to identify persons at high 
risk of hip fracture, however, is premature for several reasons. First, few, if 
any, longitudinal studies have observed the rate of hip fractures among 
women with baseline bone mass determinations at menopause (30). Without 
these data, determining the impact of risk factor modifications is problematic 
(7, 18). Second, most studies that have demonstrated a reduction in hip 
fracture risk among women who have ever used estrogen have not included 
women older than age 74, the age group at highest risk for hip fracture (1, 46, 
104). One recent Swedish cohort study demonstrated a reduction in the rate of 
trochanteric hip fractures among women who had taken estrogen for up to five 
years before age 60; no effect has yet been seen for older women (63). Third, 
the increased risk of endometrial and other cancers associated with ERT, and its side effects, 
may outweigh the benefit of a reduced risk of hip fracture for women considering the use of ERT 
(18). If clinical trials confirm the protective effect of ERT among women at 
risk of heart disease (32a), then screening for osteoporosis may play a limited 
role in a woman’s decision to use ERT (18). Finally, for the public health 
practitioner, body mass index (weight in kilograms divided by height in 
meters squared) appears to be highly correlated with bone mass, as de- 
termined by densitometry (M. Nevitt 1991, personal communication). 
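
Body mass index is simple to compute from measurements that any public health program can collect. A minimal sketch follows; the function name and the example weight and height are illustrative assumptions.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical person: 60 kg, 1.60 m tall.
print(round(body_mass_index(60.0, 1.60), 1))
```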

The determination of the mechanism of falls and injuries and ways to alter 
other risk factors will likely have a greater impact on hip fracture than the use 
of ERT by older women (61). 


CHRONIC DISEASES Cerebrovascular, cardiovascular, and neurologic dis- 
orders may increase the number of falls among older persons (44, 51, 52, 88). 
In a recent population-based study (89), the most common concurrent medical 
diagnoses associated with a fall injury event were syncope (16%), conduction 
disorder/dysrhythmias (15%), chronic ischemic heart disease (9.3%), anemia 
(8.7%), diabetes (8.3%), and hypertensive disease (8.2%). The prevention 
and amelioration of these chronic ailments through chronic disease prevention 
activities could lead to a substantial decrease in the future number of falls and 
fall injury events among older persons. 

Some investigators suggest that the risk of falling increases with the 
number of these conditions present, especially those that impair sensory, 
cognitive, neurologic, or musculoskeletal functioning (98, 100). Although 
these conditions do contribute to the occurrence of a fall, either through their 
physiologic effect or through a joint effect with environmental hazards, each 
chronic disorder probably does not contribute the same amount of risk to 
falling. 


GAIT AND BALANCE Gait and balance abnormalities have been repeatedly 
implicated in falls among older persons (41, 67, 80, 99, 100). These 
abnormalities may be related to changes in age, disease, or medication use or 
to dysfunction of the nervous, skeletal, circulatory, or respiratory systems 
(87). Clinically, the older person with a history of falls often has a stiff, 
uncoordinated gait and poor control over posture and body position (22, 87). 
An increasing number of clinical and laboratory measurement tools are 
available for assessing the complex neuromuscular functions of gait, balance, 
and postural control. Unfortunately, we do not yet have a way to use these 
clinical and laboratory measurements to develop easily administered screen- 
ing procedures for public health practitioners. In addition, we do not know if 
physical retraining through exercise, muscle strengthening, or some other 
mechanism will help decrease fall injury events among older persons (see 
Buchner et al, this volume). The ability to influence corrective and protective 
response through training and education should also be investigated. An 
understanding of how specific gait and balance problems transform environ- 
mental features into “fall hazards” would help us focus our environmental 
intervention efforts. 


VISION Impaired visual acuity and depth perception have been associated 
with an increased risk of falling and fracturing a hip (25, 67, 100). Visual 
acuity might be very important in maintaining postural control among persons 
with neuromuscular disorders (3, 13, 25, 49, 70). Visual acuity, depth 
perception, contrast sensitivity, peripheral vision, visual perception, dark 
adaptation, and glare tolerance are all involved in the detection and avoidance 
of environmental hazards and can become affected by age-related vision 
changes, cataracts, macular degeneration, and glaucoma (98). Early detection 
and treatment of common conditions, such as glaucoma and cataracts, should 
improve visual function and might reduce falls (2). Recent study findings, 
however, implicate topical eye medications as increasing the risk of falling 
among a selected group of elderly glaucoma patients (32). Whether this effect 
is real or a manifestation of other chronic conditions or disease-drug in- 
teractions needs further investigation. 


MENTAL STATUS Impaired mental status and depression are associated with 
an increased risk of a fall injury event (6, 9, 67, 73, 99, 105). This association 
may be related to the increased exposure to hazardous situations, because of 
confusion, impaired judgment, distraction, agitation, and lack of awareness. 
Associated gait and balance deficits and psychomotor depression may also 
increase the chance of falling. Antidepressant and sedative medications used 
for these conditions contribute to the increased risk of falls and fall injury 
events (78). Simple screening tests, such as the Mini-Mental Test (31), are 
available to determine the person’s degree of cognitive impairment. We do 
not yet know, however, which interventions can reduce the incidence of fall 
injury events in this high-risk group of older persons, while maintaining their 
highest level of cognitive functioning. Also, because high rates of suicide 
occur among older white men (58), clinicians must use caution in changing 
the patterns of antidepressant medication use to reduce the incidence of fall 
injury events. 


MEDICATION USE Ray et al (78) analyzed Michigan Medicaid data in a 
large and well-designed study and found a significantly increased risk of hip 
fracture in older persons currently taking long half-life psychotropic medica- 
tions. They estimated that about 14% of these hip fractures were attributable 
to current use of psychotropic medications. These medications (including the 
widely used benzodiazepines, barbiturates, phenothiazines, and tricyclic anti- 
depressants) may act by decreasing alertness, affecting judgment, com- 
promising neuromuscular function, or causing dizziness and syncope. Mac- 
Donald & MacDonald (56) found a substantial excess of barbiturate use 
among hip fracture patients who sustained the injury at night, compared with 
those whose fractures occurred during the day. Studies of falling suggest that 
recent use of any psychotropic medication may be associated with an in- 
creased risk of falling (105). Tinetti et al (99) noted an increased risk of 
falling among persons who use some psychotropic medications, but the 
prevalence of psychotropic medication use among nonfallers was low (only 
one of 228 nonfallers was a user) compared with the prevalence reported in 
other surveys (67, 78). 
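
An attributable share such as the 14% estimate quoted above depends on both the strength of the association and how common the exposure is among cases. The sketch below uses one common formula for the attributable fraction computed from case data; it is not necessarily the method used in the cited study (78), and the input values are hypothetical.

```python
def attributable_fraction_among_cases(exposed_share_of_cases: float,
                                      relative_risk: float) -> float:
    """Attributable fraction = p_c * (RR - 1) / RR.

    p_c is the proportion of cases who were exposed; RR is the relative risk
    (or an odds ratio used as its approximation).
    """
    if not 0.0 <= exposed_share_of_cases <= 1.0:
        raise ValueError("exposed share must lie between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return exposed_share_of_cases * (relative_risk - 1.0) / relative_risk


# Hypothetical inputs: 30% of cases exposed, relative risk of 2.0.
print(f"{attributable_fraction_among_cases(0.30, 2.0):.0%}")
```

With no excess risk (a relative risk of 1.0) the attributable fraction is zero regardless of how widely the medication is used, which is why both quantities must be estimated reliably.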

In several studies, however, investigators have failed to find a relationship 
between psychotropic drug use and falling or fracturing a hip (34, 67, 71). 
One major reason for this discrepancy in the findings is the possible effect of 
an underlying condition, such as dementia, or the effect of drug-disease 
interactions, such as psychotropics and dementia. For example, in recent 
analytic studies (78, 99) that included cognitively impaired persons, research- 
ers found an increased risk of falling or fall injury events associated with drug 
use, whereas the findings of studies that specifically excluded persons with 
cognitive impairment revealed no increase in risk (34, 67). 

Falls related to multiple drug use may be an important problem (55, 98). 
For example, Buchner & Larson (8) found that patients with Alzheimer’s 
disease increased their risk of falling with the increased number of drugs 
taken. Physicians, pharmacists, and public health practitioners need to do more 
to help older persons discard outdated medications and to monitor medication 
use more closely to prevent drug-drug interactions that can cause falls. 

Diuretics or antihypertensives might contribute to falling through fatigue, 
volume depletion, decreased mental alertness, or postural hypotension (98). 
Some researchers, however, have shown that use of thiazide diuretics might 
actually decrease the risk of a hip fracture, by decreasing urinary calcium 
excretion (27, 47b, 77). Because of the relatively high incidence of metabolic 
and other side effects associated with thiazide diuretics, these drugs are being 
replaced by other antihypertensives. Nevertheless, further work is warranted 
to investigate the preventive aspects of thiazides. 

In summary, the balance of evidence suggests that psychotropic drugs play 
a role in the risk of falling among older persons. Educational efforts have 
effectively reduced physician prescribing of long half-life benzodiazepines 
(76). Also, New York state has 
recently limited excess prescribing of benzodiazepines through regulation 
and, thus, has decreased their use by low-income older persons by 25% (57); 
however, the prescribing of less acceptable medications increased (103b). 
Further work on drug-disease interactions and dose-specific effects 
is needed to define more accurately who among the users of psychotropic and 
other medications are at highest risk of injury (36). This work is critical to 
designing appropriate intervention efforts. 


ALCOHOL USE Alcohol use is frequently a factor in injury. Alcohol acts as a 
depressant on the central nervous system and may increase the risk of falling 
and fall injury events by adversely affecting gait, balance, and cognition (72). 
Alcohol use has been frequently associated with falls in persons younger than 
age 65 (26, 39, 40), but most studies have not shown an association for older 
persons (34, 67, 73, 99). One study presents only nonanalytic evidence of an 
association between alcohol use and the risk of falling among older persons 
(103). This lack of evidence might reflect differential survival, because heavy 
alcohol use is strongly associated with premature mortality from a variety of 
causes (75). Although alcohol use has not been shown to increase the risk of 
falls or fall injury events among older persons, the chronic use of alcohol 
interferes with tissue regeneration and immunologic function. An older per- 
son who drinks can, therefore, have a more severe outcome than a nondrinker 
who experiences the same injury event (14). In addition, the chronic use of 
alcohol can lead to various chronic medical conditions that predispose a 
person to sustain a fall or fall injury event. 


Risk Factors Related to the Agent 


Although we know much about the host and the environment, we know very 
little about the mechanism or transference of energy during a fall (14). 
Mechanical energy is the most common agent of injury due to falls among 
older persons (Table 2). Speed, violence, and concentration are key elements 
in transforming mechanical energy into an impact injury, which occurs by 
deforming tissue beyond its failure limits (102). Mechanisms that affect the 
risk of impact injury are the resistance of the body through inertial forces, the 
elastic capacity of the tissues, and the viscous tolerance of the body organs 
(14). Inertial forces from excessive acceleration of the skeleton lead to the 
tearing of an organ. An example is brain injury that results from the sudden 
acceleration of the skull during impact with the ground, with the loosely 
attached brain lagging behind. Because of its elastic capacity, the body can 
absorb a tremendous amount of mechanical energy and protect organs through 
resistance to impact. This resistance of the human body has been demon- 
strated by persons who have survived falls from extreme heights (14). Older 
persons, however, tend to have decreased elasticity of tissues and organs, 
which can lead to fractures of the hip, ribs, and skull. Finally, viscous 
tolerance, the ability of organs to withstand rapidly applied strain forces, can 
be exceeded during high-speed impact, thus leading to contusion and possible 
rupture of an organ. For example, the heart may sustain damage when the 
sternum is rapidly and excessively moved during a motor vehicle crash. The 
same compression, occurring slowly, would not necessarily damage the heart, 
because the organ can tolerate gradual compression. 

Biomechanics is the discipline in which researchers investigate and explain 
the physical and physiologic responses to impact that result in injury (102). 
Better understanding can lead to protective devices for persons involved in 
potential injury events. We have yet to realize the tremendous, untapped 
potential in applying biomechanics to the control of injuries, other than those 
related to motor vehicles. Nevertheless, the biomechanics of falls and hip 
fractures have received growing attention over the last several years. One 
recent finding suggests that the position at impact, the location of impact, and 
the absorption of energy may be more important than the strength of the bone 
in determining the risk of hip fracture among older persons (53). 


Risk Factors Related to the Environment 


The environment has been implicated in one third to one half of all falls or fall 
injury events (54, 87, 92, 103). As early as 1950, Castle (10) implicated 
lighting and stair structure as causes of falls (Table 2). In 1955, Droller (21) 
implicated loose rugs and defective floors, and others (54, 83, 92) have 
implicated light switch hazards, thresholds, extension cords, slippery sur- 
faces, and other household products. Architectural design of stairways and 
homes and visual patterns on flooring can cause missteps and increase the risk 
of falling (3, 13, 49, 70). Recommended solutions have included use of 
slip-resistant stripping in bathtubs, proper placement of shelving, removal of 
throw rugs, redesign of stairs, improvements in shoe design, and im- 
provements in lighting (48, 81). These recommendations make intuitive 
sense, but nearly all of the studies on which these recommendations are based 
were descriptive, that is, they did not include valid comparison groups (81). 
Many of these studies have also specifically asked respondents what caused 
their falls, thus leading directly to interviewer and recall bias. Although 
environmental hazards probably contribute to falls and fall injury events in 
older persons, we do not know the extent of this contribution, how multiple 
potential hazards interact, and how this effect is modified by host and agent 
factors. 

A particular problem in previous studies has been the instruments used to 
assess the environment. Most of those instruments have actually been empir- 
ically developed checklists that are based on the experience or area of interest 
of the investigator (12, 86, 96, 97, 101). They have also been unstandardized, 
have lacked definitions, and have not been evaluated to determine if they are 
measuring what we think they are measuring (validity) or if they are measur- 
ing it in a consistent manner (reliability) (81). The researchers using these 
instruments have also assumed that each hazard contributes equally to the 
hazard potential of the home, but no studies have confirmed this approach 
(actually, none have even addressed this issue). 

Most of these instruments have also tended to lack specificity. For example, 
most current instruments do not determine which areas of a home an older 
person ever uses or the amount of time spent there (81). Thus, these in- 
struments would categorize a room as hazardous, even if the hazard were 
present in an area in which the older person spends little or no time. Even if 
the room contained hazards, an older person might fall in a nonhazardous 
area, or the hazard might not be related to the fall. 

To determine the effect of environmental factors on falls and fall injury 
events, we should consider categorizing potential hazard exposures into 
persistent and variable exposures (81). Persistent exposures are those that tend 
to be fixed into the building or unlikely to change frequently over time, thus 
making direct measurement easy. Cabinets, flooring, stairs, and the absence 
of grab bars in the bathroom are examples of persistent exposures. Variable 
exposures are those that change frequently and, thus, make direct measure- 
ment difficult. Lighting, for example, varies considerably during the day, 
throughout the year, and in different rooms. Lighting, glare, and other 
variable exposures can be best obtained through self-report. For both types of 
exposures, predetermined definitions should be established for variables, 
including “use areas,” and staff should receive standardized training to eval- 
uate the environment. 

It is extremely difficult to compare homes of fallers and nonfallers, as there 
are many variations in room size and design. One useful approach is to 
develop a hazard index for an older person’s living arrangements (67). This 
hazard index should be based on a valid, reliable instrument and on those 
factors shown to increase the risk of fall injury events. Public health prac- 
titioners could then use a standard hazard index form in a standard way during 
each visit to an older person’s home. 
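
As a rough illustration of how such an index might be computed, the sketch below weights each hazard by the share of time the person spends in the area where it is found, so that a hazard in an unused room contributes nothing. The hazard names, the weights, and the time-share weighting itself are hypothetical assumptions made for illustration; they are not taken from reference 67 or from any validated instrument.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str      # hypothetical hazard label
    area: str      # room or "use area" where the hazard is found
    weight: float  # hypothetical risk weight; real weights would need validation


def hazard_index(hazards, time_share):
    """Sum of hazard weights, each scaled by the share of time spent in its area.

    time_share maps area -> fraction of waking time spent there (0 to 1).
    Areas the person never uses are absent from the map and contribute nothing.
    """
    return sum(h.weight * time_share.get(h.area, 0.0) for h in hazards)


# Illustrative values only (assumptions, not data from the studies cited).
hazards = [
    Hazard("loose throw rug", "living room", 2.0),
    Hazard("no grab bars", "bathroom", 3.0),
    Hazard("poor stair lighting", "basement stairs", 3.0),
]
time_share = {"living room": 0.5, "bathroom": 0.1}  # basement stairs never used

index = hazard_index(hazards, time_share)
print(round(index, 2))
```

In this sketch the poorly lit basement stairs add nothing to the index because the person never uses them, which is the distinction that instruments without "use areas" cannot make.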


Surveillance 


Surveillance is a necessary activity to monitor health events on an ongoing 
basis (11). A surveillance system for falls should collect data that are repre- 
sentative of a defined population (11). Data from surveillance activities can 
then be used to determine the need for public health programs and to assess 
their effectiveness. Due to the geographic variation in the distribution of falls 
and resultant injuries, state and local injury surveillance systems are critical to 
set local public health priority areas (84). 

Most falls do not result in injury or lead to medical care (35, 67, 87, 99). In 
addition, many falls and their resultant injuries may be forgotten by older 
persons, especially falls that resulted in minor or no injury (20, 87). It is 
clearly unrealistic for public health professionals to monitor all falls, regard- 
less of outcome, or to develop surveillance systems based on definitions 
derived from the Kellogg International Work Group (45), which are more 
suitable for in-depth clinical investigations of falls. 

A surveillance system based on the external causes of injury would lead to 
uniformity of data, thus allowing us to compare fall injury events by geogra- 
phic area. Unfortunately, E-codes are not routinely collected in medical 
records, except on death certificates (94). Thus, reliable estimates of the 
incidence of fall injury events in a population-based setting are not readily 
available to the public health practitioner. 

Currently, no national system of collecting data on the causes of nonfatal 
falls exists, although many states have hospital discharge data systems (94). 
Hospital discharge data systems contain many promising features of a useful 
surveillance system, including representativeness and specificity. E-coding of 
hospital discharge data has been recommended by the Council of State and 
Territorial Epidemiologists (16) and would fill the data gap between mortality 
and morbidity data (84, 94). Very few hospitals now use E-codes for injury 
information; however, in June 1991, the National Committee on Vital and 
Health Statistics unanimously passed the recommendation that E-codes be 
included soon in Uniform Hospital Discharge Datasets (65). 

The use of E-codes to monitor falls and fall injury events has several 
shortcomings, all of which can be improved significantly (65, 84, 94). 
Specifically, the medical record often contains insufficient information to 
code the external cause and the place of injury. This problem is due to the 
previous lack of a national requirement for E-coding in hospitals and the 
exclusion of E-codes in the current reimbursement system for hospitals. Thus, 
hospitals have had no incentive to record comprehensive descriptive informa- 
tion on injuries. By including a description of the mechanism involved, a 
statement of the intent of the injury, and where the injury occurred, hospitals 
could markedly increase the ability of the system to provide useful informa- 
tion. This information could be included as an important component of quality 
improvement programs for hospitals that care for injured patients. Physician 
training to promote better reporting, both in death certificates and hospital 
discharge summaries, would greatly improve the system. Finally, the index 
for E-codes should be revised to clarify definitions for medical records 
personnel, and E-codes should be regularly refined and updated just as the 
disease codes in ICD-9-CM have been (65). Efforts to address all of these 
past problems associated with E-codes appear promising. 


FUTURE PREVENTION EFFORTS 


Although we have learned much over the last decade about the causes of falls, 
we still know little about the most effective ways of preventing their occur- 
rence. We have many promising leads, however. For example, many current 
efforts to prevent chronic diseases through smoking cessation, exercise pro- 
motion, and alcohol reduction programs may also lead to the prevention of 
many falls and injuries. 

Surveillance data should guide our prevention efforts. They can also help 
public health practitioners to describe the fall injury event problem in more 
detail, target high-risk individuals and high-risk areas, maximize use of 
limited resources, attract public attention to this problem, and monitor in- 
tervention strategies. 

After identifying persons at high risk for a fall injury event, public health 
practitioners can use Haddon’s matrix to conceptualize injury control options 
or minimize the consequences of injuries (Table 3). Haddon’s matrix sepa- 
rates the injury event into three distinct phases: preevent, event, and postevent 
(37, 38). Each phase of the Haddon matrix can also include information on 
the potential impact of the host, agent, and environment. The preevent phase 
of injury might be affected by removing or altering energy sources that have 
the potential to increase a person’s risk of falling, or by altering pathophysio- 
logic conditions that would enable an older person to cope better. Proper 
stairway design and lighting, better control over multiple drug prescriptions, 
exercise programs designed for general muscle strengthening, and homes 
specifically designed for older persons are examples. Technological develop- 
ment of energy-absorbing flooring would be useful in managing the event 
phase of injury. Networks of emergency response call buttons or buddy 
systems could improve overall survival in older persons who fall, but cannot 
get help quickly. 

It would also be useful for fall prevention efforts to be directed not only at 
older persons, but also at younger persons. For example, targeting young 
persons with smoking cessation, exercise promotion, and alcohol use reduc- 
tion programs may reduce both chronic diseases and a potential outcome of 
these diseases—falls. Educating perimenopausal women about calcium in- 
take, general nutrition, and the potential benefits and risks associated with 
estrogen use and teaching both middle-aged men and women about the need 
to maintain physical fitness and bone strength may reduce future injuries. 

Prevention efforts must balance the need to reduce risks with the need to 
maintain mobility, functional activities, personal autonomy, and quality of 
life. Reduction in activity and mobility after a fall cannot, by itself, eliminate 
the risk of falling. Fear of falling and excessive restrictions in activity may 
initially reduce a person’s risk of falling, but may lead to increasing the risk 
over time by decreasing self-confidence and physical conditioning. 


Table 3 Possible elements of a public health program to prevent falls among the elderly 
based on the Haddon matrix 

Phase      Elements 

Preevent   Exercise promotion and physical conditioning 
           Cessation of smoking and alcohol 
           Nutrition education 
           Reduction of psychotropic medication use 
           Regular eye examinations 
           Osteoporosis prevention 
           Hazard evaluations and modifications of residential institutions 
           Home hazard awareness 
           Use of proper footwear 
           Safe outdoor walk routes during all weather conditions 
Event      Energy-absorbing flooring for high-risk areas* 
           Hip protection devices for high-risk persons* 
Postevent  Emergency response call systems or buddy systems 
           Improvements in emergency communication systems 
           Health promotion and hazard prevention (all elements listed under preevent) 

*Currently under development. 


CONCLUSION 


Public health practitioners must continue to rely on empirically derived 
interventions until effective prevention modalities are demonstrated for older 
persons (91). Clearly, more work is needed to determine which interventions 
can decrease the risk of a fall or fall injury event and how environmental 
factors interact with pathophysiologic processes, primary aging processes, 
and pharmacologic and behavioral factors in increasing or decreasing this 
risk. A need exists for better translation and dissemination by researchers of 
their findings to public health practitioners. Injury research must also include 
the principles of mechanics to investigate and explain the physical and 
physiologic responses to impact that result in fall injury events. 

Understanding both the many components associated with the increased 
risk of falls and the ways to modify the injury event so that it does not lead to 
morbidity or disability requires a multidisciplinary approach (e.g. behavioral, 
medical, public health, and engineering disciplines). This understanding 
would provide the public health practitioner with the scientific base needed to 
institute effective fall intervention programs. 


INTRODUCTION 


In the United States, injuries are the most important cause of morbidity and 
mortality for persons between the ages of 1 and 44 years (59). Injuries are also 
an important cause of hospitalization and death in older adults, a problem 
often overlooked against the backdrop of cancer, cardiovascular disease, and 
stroke. This review outlines the epidemiology of nonfall trauma in older 
adults, including the incidence of the problem, special populations at risk, 
known risk factors, and proven and possible strategies for prevention. 

Before examining patterns of injury in older adults, we should define injury 
and modern injury control. The term “injury” refers to damage resulting from 
acute exposure to physical or chemical agents (2). The term “accident” is 
purposely not used because it connotes randomness and fatalism, as in 
“accidents happen.” The intent of the modern science of injury control is to 
reduce and control the damage from injuries, not to blame the victim or seek 
retribution for negligent or careless behavior (47). 

It is important to distinguish events from the injuries themselves. The latter 
does not necessarily follow from the former. A motor vehicle crash (MVC) 
may occur, but injury can be prevented if the occupants are safely protected. 
Thus, one can examine the factors that predispose to the event (preevent) and 
those that follow the injury (postevent) separately from the injury itself. Combining this 
dimension of examining injuries with the classic epidemiology paradigm of 
host, agent, and environment, Haddon created a matrix for examining the 
causes and prevention strategies for injuries. The matrix displays preevent, 
event, and postevent phases on one axis with host, vector, physical environ- 
ment, and social environment on the other axis (2). 
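
The layout of the matrix can be sketched as a small data structure: three phases on one axis and four factors on the other, giving twelve cells in which causes or prevention strategies can be recorded. The axis labels follow the description above; the function name and the single example entry are illustrative assumptions, not part of Haddon's formulation.

```python
# Axes as described in the text: phases on one axis, factors on the other.
PHASES = ("preevent", "event", "postevent")
FACTORS = ("host", "vector", "physical environment", "social environment")


def empty_haddon_matrix():
    """Return a phase -> factor -> list mapping with all twelve cells empty."""
    return {phase: {factor: [] for factor in FACTORS} for phase in PHASES}


matrix = empty_haddon_matrix()

# Illustrative placement only (an assumption for this sketch): occupant
# restraints act during the crash itself and belong to the vehicle.
matrix["event"]["vector"].append("occupant restraints")

for phase in PHASES:
    for factor in FACTORS:
        print(f"{phase:10s} {factor:22s} {matrix[phase][factor]}")
```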


NONFALL INJURIES IN OLDER ADULTS 


Overall Incidence 


In the US in 1985, 2.1 million adults aged 65 and older had a nonfall injury 
(7346 per 100,000 persons) (Table 1). Contributing to this injury total were 
the estimated 1.9 million persons with medically treated injuries who did not 
require hospitalization, 141,000 with hospitalized injuries, and 21,500 with 
fatal injuries (59). 
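
The three components quoted above can be checked against the 2.1 million total with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names are arbitrary, and the shares it prints are derived here rather than reported in the source.

```python
# Counts quoted in the text for adults aged 65 and older, US, 1985.
counts = {
    "nonhospitalized": 1_900_000,  # medically treated, not hospitalized
    "hospitalized": 141_000,
    "fatal": 21_500,
}

total = sum(counts.values())
print(f"total injured: {total:,} (about {total / 1e6:.1f} million)")

# Share of all nonfall injuries contributed by each severity level.
shares = {level: n / total for level, n in counts.items()}
for level, share in shares.items():
    print(f"{level:16s} {100 * share:5.1f}%")
```

The components sum to 2,062,500, which rounds to the 2.1 million reported.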

Injury rates vary with age and gender (Table 1). The most striking differ- 
ence by gender is found among fatal injuries; men have rates 2.5 times greater 
than women in both the 65-74 and 75+ age groups. The rates and numbers of 
total injuries and nonhospitalized injuries are higher in the 65-74 age group 
than the 75+ age group. Although the actual number of persons hospitalized 
for injuries is larger for persons aged 65-74, the hospitalization and death 
rates for nonfall injuries are greater for those aged 75 and older. 

Injuries as a cause of death in older adults may be underestimated, according 
to a study of multiple cause of death coding on death certificates (15). 
When an injury was listed on the death certificate as either a cause of death or 
a significant condition associated with death, the percentage with injury 
identified as the underlying cause of death was more than 90% for MVCs, 
assaults, and suicides. However, for other types of injuries listed on the 
death certificate, less than 70% identified the injury as the underlying cause of 
death for those aged 65-75, and less than 50% for those 85 and older. 


Table 1 Nonfall injury rates in older adults by age and gender, 
1985 (Rate per 100,000 persons) 

Age and 
Gender     Total   Fatalities   Hospitalized   Nonhospitalized 

Total       7346       75           494             6775 
  65-74     7793       60           419             5983 
  75+       6685       98           603             7312 

Men 
  65-74     6186 
  75+       7259 

Women 
  65-74     9033       36 
  75+       6373       60 

Calculated from data in Ref. 59. 


Costs 


An important aspect of the impact of injury is the economic cost. Rice, 
MacKenzie, and Associates (59) estimated the lifetime economic cost of 
injury, based on the direct cost for medical treatment and rehabilitation, and 
the indirect cost associated with life years lost, including the loss of earnings 
due to short- and long-term disability and premature death. The lifetime cost 
estimate for the 2.1 million older persons injured in the US in 1985 was $5.1 
billion (Table 2). The direct expenditures for hospital care, physician ser- 
vices, nursing home care, drugs, and other medical and rehabilitation services 
account for $2.8 billion, 56% of the total lifetime costs. Indirect costs for 
morbidity and mortality are estimated as $1.8 billion and $441 million, 
respectively. In contrast to other ages, the economic burden of injury for older 
adults was greater for women than men. Mortality costs were similar in older 
men and women, but women had greater direct and morbidity costs. 
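
The cost breakdown can be reproduced from the totals reported in the text and in Table 2 (in millions of 1985 dollars). The sketch below only restates those figures and derives percentage shares from them; the variable names are arbitrary, and the shares other than the 56% for direct costs are computed here rather than quoted from the source.

```python
# Lifetime cost of nonfall injury, persons aged 65 and older, 1985
# (millions of dollars, from the text and the Total row of Table 2).
costs = {"direct": 2833, "morbidity": 1780, "mortality": 441}
total = sum(costs.values())

for component, amount in costs.items():
    print(f"{component:10s} ${amount:5d}M  {100 * amount / total:5.1f}%")

direct_share = 100 * costs["direct"] / total

# Totals by sex from Table 2; these should add up to the same grand total.
by_sex = {"men": 1841, "women": 3213}
women_share = 100 * by_sex["women"] / sum(by_sex.values())
print(f"total ${total}M; direct share {direct_share:.0f}%; "
      f"women's share {women_share:.0f}%")
```

The three components sum to the $5054 million total, and the direct share works out to the 56% stated above.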


Change in Lifestyle 


Another very important measure of the impact of injury is changes in lifestyle 
as a result of trauma. Injuries in older adults may mean the difference between 
independent lives in private residences and dependent lives that require care in 
a nursing home. Adult children may force the older person to forgo driving 
after a motor vehicle crash. In Washington state, 20% of older adults who 
entered a hospital for a nonfall injury were discharged to a nursing home or 
intermediate care facility, rather than their own homes. 


Table 2 Lifetime cost of nonfall injury for persons aged 
65 and older, 1985 

                  Dollar Amount (millions) 
                                Indirect Cost 
           Total    Direct   Morbidity   Mortality 

Total       5054     2833      1780         441 
Men         1841     1083       522         236 
Women       3213     1751      1258 

Calculated from data in Ref. 59. 


CAUSE-SPECIFIC INJURIES 


The major causes of nonfall injuries in older adults are MVCs (occupants and 
pedestrians), suicides, assaults, and burns; each is discussed below. Poisonings, 
drownings, and suffocations are less common in older adults and are not 
elaborated further. 


Motor Vehicle Occupants 


INCIDENCE Motor vehicle crashes are the leading cause of injury death and 
the second leading cause (after falls) of medically treated injuries and hospi- 
talizations (2). Although older adults appear to have low MVC rates com- 
pared with other ages, older adults drive about half as much as other age 
groups. Consequently, when exposure (the number of miles driven) is 
considered, drivers aged 65 and older have the second highest MVC rates 
after young adults. Drivers aged 85 and older have the highest crash rates per 
miles of travel (Figure 1) (8, 11, 24). 
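
The effect of adjusting for exposure can be shown with a small worked example. All of the numbers below are hypothetical and chosen only to mirror the statement that older adults drive about half as much as other age groups; they are not data from the studies cited, and the function names are assumptions of this sketch.

```python
def crashes_per_thousand_drivers(crashes, drivers):
    """Crude rate: ignores how much each driver actually drives."""
    return crashes * 1000 / drivers


def crashes_per_million_miles(crashes, drivers, miles_per_driver):
    """Exposure-adjusted rate: crashes divided by total miles driven."""
    return crashes * 1_000_000 / (drivers * miles_per_driver)


# Hypothetical groups (illustrative values only).
older = {"crashes": 30, "drivers": 1000, "miles_per_driver": 5_000}
other = {"crashes": 50, "drivers": 1000, "miles_per_driver": 10_000}

older_crude = crashes_per_thousand_drivers(older["crashes"], older["drivers"])
other_crude = crashes_per_thousand_drivers(other["crashes"], other["drivers"])
older_adjusted = crashes_per_million_miles(**older)
other_adjusted = crashes_per_million_miles(**other)

print(f"per 1000 drivers:  older {older_crude:.1f}, other {other_crude:.1f}")
print(f"per million miles: older {older_adjusted:.1f}, other {other_adjusted:.1f}")
```

With these invented figures the older group looks safer per driver but has the higher rate per mile, which is the reversal described above.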

Older drivers have different crash patterns than do younger drivers; their 
crashes usually involve errors of omission (8, 11). Older drivers are more likely 
to be involved in intersection and turning crashes and head-on collisions in urban 
areas. They are also more likely to commit right-of-way and signal violations 
and to be charged with inattention. Some studies report they are more likely to 
be responsible for the MVC in which they are involved than are other ages 
(41). Conflicting results have been reported for overinvolvement in backing 
and parking related crashes. Conversely, older drivers are less likely to be 
involved in single vehicle crashes or to be cited for reckless driving, driving 
too fast, or drinking and driving (8, 11). 

Older adults are more frequently admitted to the hospital and die from less 
severe injuries than younger persons. In one study, the proportions of hospital 
admissions that were for MVC trauma were 116 per 1000 emergency room 
visits for ages 65-74 and 248 per 1000 for age 75+, which are two and four 
times greater than the proportion of admissions for all other ages (3). Low 
severity injuries, as measured by injury severity scores (ISS), result in 
significant mortality for ages 70 and older, but are rarely fatal in younger 
persons (16). For example, 15% of MVC injury admissions with an ISS less 
than 20 died among those 70 and older, compared with less than 1% of 
individuals less than 50 years of age (2). 


Figure 1 Motor vehicle crash rates per miles of travel by age. 


RISK FACTORS 


Medical conditions Conflicting results have been reported for associations 
between chronic medical conditions and MVCs. Drivers aged 60 and older 
with medical conditions of diabetes, epilepsy, cardiovascular disease, 
alcoholism, and mental illness reported to the California Department of 
Licensing had twice as many crashes per 1 million miles of driving as 
drivers with no reported disease (72). Among persons with medical conditions 
reported to the Washington State Department of Licensing who had licensing 
and/or driving restrictions, such as special equipment, area or time of day 
restriction, or periodic reexamination, men older than 65 with reported 
epilepsy had increased risks of MVC. However, no increased risk of MVC or 
traffic violations was found for older adults with diabetes or heart disease 
license restrictions (13). These results among the small number of older adults 
conflict with the overall study findings of higher MVC rates for persons of all 
ages whose licenses were restricted as a result of diabetes, epilepsy, fainting, 
and other reported conditions. Slightly higher MVC rates that were not 
statistically significant were also found for drivers of all ages with heart 
disease restrictions. A recent study of diabetes and epilepsy in all ages, which 
attempted more complete ascertainment of persons with these medical con- 
ditions by identification from clinic and hospital records, found each condi- 
tion had a relative risk of 1.3 for MVC, and both had a relative risk of 1.6 for 
MVCs causing injury (20). 

A study comparing senile adults aged 60 and older to healthy 30-59-year-olds 
found that the older group had a twofold greater MVC rate per mile driven; 
cardiovascular disease coupled with senility in the older adults increased 
MVCs fourfold (73). 
Drugs For persons of all ages, psychoactive drugs have been reported in 
several studies to be a risk factor for drivers involved in MVCs (25, 39, 63); 
one study found no increase (27). Although no published study has evaluated 
drugs and MVC risk in older drivers, Ray’s study of Medicaid enrollees found 
current users of benzodiazepines, tricyclic antidepressants, or both agents had 
relative risks of 1.5, 2.2, and 2.1, respectively, for MVCs compared with 
nonusers (W. A. Ray 1991, unpublished observations). An increased risk was 
not found for current users of opioid analgesics or antihistamines. 


Alcohol Alcohol does not appear to be as major a risk factor for MVC in 
older adults as it is for younger persons. Of the approximately 6.6 million 
police-reported MVCs that occurred in the US in 1989, 1% of the crash- 
involved drivers aged 65 and older were reported by police as using alcohol, 
compared with 5% of drivers aged 21-24 and 25-34 (48). In fatal MVCs, 7% 
of drivers aged 65 and older had blood alcohol levels of 0.10 gm% or greater, 
compared with 35% of 20-24-year-old drivers (46, 55). Of motor vehicle 
fatalities among persons aged 65 and older, 14% were alcohol related, 
compared with 52% of persons 20-24 (46). 

In an 11-state survey conducted by the AAA Foundation for Traffic Safety 
and the Safety Research and Education Project at Columbia University, 42% 
of drivers aged 55 and older reported that they did not drink alcoholic 
beverages, compared with 17% of 30-45-year-old drivers (75). Only 0.4% of 
these older drivers and 1% of the younger drivers reported drinking more than 
one alcoholic beverage a day (75). The Alcohol Working Group of the 
Surgeon General’s Workshop on Health Promotion and Aging stated that 
although it is not possible to determine the prevalence of alcohol abuse, 
reported drinking appears to decline as the population ages (57). 


History of traffic infraction and MVC Several studies report that drivers 
with a history of repeated traffic infractions and/or crashes may be at in- 
creased risk for subsequent MVC (34, 35), but the usefulness of driving 
history has been questioned, as the repeat offenders (violations and crashes) 
account for a small percentage of all crashes (33, 66). The role of either of 
these factors as a risk for MVC among older drivers is unknown. 


PREVENTION The strategies to prevent and reduce injury and death from 
MVC in older adults should be multifaceted. Control requires intervention at 
several levels, including federal, state, and local, and involves changes in the 
host, the agent, and the environment (47). Prevention approaches should 
include prevention of the occurrence of MVC, prevention of injury once 
MVC occurs, and prevention of adverse outcome when injury does occur. 
One of the tenets of modern injury control is that changes in the environment 
or products are more likely to result in a reduction of injuries than is a focus 
on behavior change (2, 47). Current prevention approaches include occupant 
protection, driver education, improved emergency medical and rehabilitation 
services, and license renewal changes. 


Occupant protection The most effective interventions for motor vehicle 
occupants are seat restraints, including lap and shoulder belts and air bags. 
Studies have shown that seat belt restraints are 45% effective in preventing 
MVC fatalities and 50% effective in preventing moderate to critical MVC 
injuries (52). Not only do restraints reduce the risk of ejection, but they also 
prevent or mitigate the second collision of the occupant with the vehicle. New 
passenger cars are now required by federal vehicle safety standards to be 
equipped with shoulder/lap belts for front seat occupants and lap belts for rear 
seat occupants. Laws requiring front seat occupants and/or all passengers to 
use seat belts were enacted in 33 states and the District of Columbia by 1990 
(46). 

Overall usage of seat belt restraints has increased from less than 20% 
nationally in 1983 to the 49% estimated from observations of drivers in 19 
cities in 1990 (49). Actual restraint use among older adults is unknown. In a 
1989 seat belt observation study, seat belt use for drivers aged 50 or older was 
45%, compared with 47% in drivers aged 25-49 (50). Older women drivers 
(53%) were more likely than men (41%) to buckle up (50). 

Older adults report the main reasons for not using seat belts are difficulty of 
use and lack of comfort (75). The approaches to increasing restraint usage 
among older adults should include manufacturing changes for more easily 
reached, attached, and released belts and education/promotion of seat belt 
use. However, because passengers have failed to wear seat belts, air bags 
were developed and have been shown to be very effective in frontal collisions 
(47). Automatic seat belts and air bags in new automobiles will help reduce 
MVC injury. 


Driver education In the late 1970s, the American Association of Retired 
Persons and the National Retired Teachers Association began sponsoring 
driver education courses for adults 55 and older. These refresher driver 
education courses, which attempt to update driving knowledge and refresh 
skills, are offered throughout the US by driving schools, automobile and 
safety organizations, and older adult organizations, usually in cooperation 
with state motor vehicle departments. Automobile insurance companies offer 
discount rates to older adults who complete the course. The effectiveness of 
these courses in improving driving is hard to assess, because of possible 
self-selection of better drivers who take the courses. Participants in California 
mature driver improvement courses had significantly lower rates of fatal or 
injury collisions and traffic convictions in the six months after the course than 
comparison drivers after adjusting for age, gender, license class, prior record, 
and area of residence (69). 


Improved emergency medical and rehabilitation services Interventions in 
the postevent phase are important in both reducing mortality and decreasing 
morbidity and disability among the survivors. Emergency medical service 
systems and regionalized trauma care have been effective in reducing the 
number of preventable deaths that occur in motor vehicle trauma and other 
types of serious injury (47). Rehabilitation care can enhance the degree of 
recovery attained by the trauma victim and, for the older adult trauma victim, 
may mean the difference between returning home to resume an active life or 
being discharged to a nursing home or other intermediate care facility. 


License renewal changes In the US, the frequency of state driver’s license 
renewal varies from one to five years, and approximately 20% of the states do 
not require any vision screening with renewal (18, 29, 74). Approximately 
20% of the states have increased, but varied, frequency of renewal regulations 
for older drivers (18). The varied policies for older drivers reflect inadequate 
scientific data on who is at high risk for MVC among older drivers. Studies 
are needed to assess whether such factors as medical conditions, drugs, 
sensory impairment, or driver history can help identify high risk drivers. 
Results from such studies would be useful for developing guidelines to screen 
drivers for more frequent and comprehensive license renewal. Such studies 
would also be helpful for clinicians and health educators so that they can 
better counsel and inform older adults about the safety of their driving. 


Other prevention strategies New prevention programs need to be de- 
veloped, implemented, and evaluated. Long-term approaches might include 
license renewal changes, improved vehicular and roadway design, and im- 
proved public transportation alternatives. 


Pedestrians 


INCIDENCE Since 1979, 14-17% of motor vehicle deaths in the US occurred 
among pedestrians, the second largest group of MVC victims after occupants. For 
persons aged 65 years and older, pedestrian injuries account for 22% of 
traffic-related deaths (26). Older adults have the highest pedestrian death rates 
of any age group, 4.7 per 100,000 persons in 1989. The rate for those aged 80 
and older is more than twice as high as it is for persons aged 70-74 and 
younger persons. The pedestrian death rate for men is two to four times as 
high as for women in older ages, a pattern similar to the gender differences 
found at all ages. 

Pedestrian-motor vehicle collisions (PMVCs) are qualitatively different 
from other types of motor vehicle-related trauma, as very few victims escape 
injury. Based on data from the National Highway Traffic Safety Administra- 
tion, only 1.1% of pedestrians struck by a car are uninjured (51). In contrast, 
94% of all MVCs involve no injury (52). The pedestrian injury rates in older 
adults are higher than those found in younger adults and second only to the 
rate in children when distance traveled, number of street crossings, and time 
spent as a pedestrian are considered (28, 70). 

Two thirds of all pedestrian deaths occur in urban areas. For older adults, 
33% of pedestrian fatalities and 50% of pedestrian injuries occurred at 
intersections, compared with 10-18% (fatalities) and 26-39% (injuries) for all 
other ages. Older pedestrian fatalities are more likely to occur during daylight 
(21). In contrast, pedestrians aged 26-64 are more likely to be fatally injured 
at night. 


RISK FACTORS There is a paucity of epidemiologic data on risk factors for 
pedestrian injuries and fatalities in older adults. In children, the other high- 
risk pedestrian group, male gender, low socioeconomic status, pedestrian 
action, and environmental factors have been identified as risk factors (44, 60). 
A case-control study of environmental factors found that children who lived in 
multifamily housing or in housing without yards had a more than fivefold 
increased risk of injury. Characteristics of the site that increased risk of injury 
were more than two lanes of traffic, high traffic volume, posted speed limits 
over 25 mph, and the presence of a marked crosswalk. The finding of an
increased risk of injury at marked crosswalks has been shown in other studies
(22).

Few studies of environmental factors have specifically been designed or 
evaluated for their effect on the risk of PMVC in older adults. The data for 
crosswalks, pedestrian signals, timing of signals, and right turn on a red light 
are reviewed below. 


Crosswalks Data regarding the safety benefits of crosswalks are conflicting. 
Herms reported a twofold increased risk of PMVC in marked crosswalks, 
controlling for differences in pedestrian volume (22). Older adults and chil- 
dren were found to have the greatest risks. However, this study did not 
address possible differences in age distribution of pedestrian volume, traffic 
volume, or other factors that may have contributed to the reason that the 
intersections were originally marked or signalized. Hauer (21) notes that
contrasting evidence, which uses a different outcome measure, comes from a study
cited in the 1965 ITE handbook. This study found that painted crosswalks 
reduced pedestrian right-of-way violations. In addition, Tobey et al (70) 
reported that absence of crosswalk marking was a relatively hazardous
intersection characteristic for pedestrians. Knoblauch et al (30) found that sites
with marked crosswalks were safer than unmarked crosswalks. Marked cross- 
walks may provide pedestrians with a false sense of safety and may actually 
increase the risk of pedestrian injury. 


Right turn on red Allowing right turn on a red light has consistently been 
shown to increase the risk of pedestrian injury. In a study of the effect of right 
turn on red laws, adopted during 1974-1977, Zador (76) found the greatest 
increase for right turn PMVCs, 117%, was among adults aged 65 and older. 


Pedestrian signals Zegeer et al (77) reported intersections with standard 
pedestrian signals had more PMVCs than intersections without signals. 
However, these authors suggest that signalized crosswalks are most beneficial 
at low speed intersections with traffic signals. This study did not consider 
possible differences, such as pedestrian volume, age, or other reasons that 
intersection marking and signalization varied. In England, pedestrian cross- 
ings that had crosswalk markings and pedestrian activated traffic lights had 
half the risk of pedestrian injury, compared with crossings without such lights 
for the same levels of vehicle and pedestrian volume (71). 

The assumed walking speed of 4 feet/second cited by the Manual on
Uniform Traffic Control Devices is the standard on which traffic engineers
base the timing of pedestrian lights. From observations of adults aged 70 and
older who were instructed to cross an intersection at a normal, comfortable 
speed, Dahlstedt found that almost 90% crossed at less than 4 feet/second 
(11). Lundgren-Lindquist et al (36) observed a comfortable walking speed 
mean of 3.4 and 3.0 feet/second for samples of 70-year-old men and women, 
respectively. Although older adults appear to walk more slowly than
current standards for timing of lights assume, the association between older adult
PMVCs and timing of lights has not been evaluated. A report from the Institute
of Traffic Engineers recommends that a walk speed of 2.5 feet/second would 
provide adequate crossing time for 87% of older pedestrians (11). 
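
The arithmetic behind signal timing can be illustrated with a short sketch. It assumes that the time needed to cross is simply the crossing distance divided by walking speed, with no start-up delay. The 48-foot street width and the function name are hypothetical, chosen only for illustration; the walking speeds are those quoted above.

```python
def crossing_time_seconds(street_width_ft: float, walking_speed_ft_per_s: float) -> float:
    """Time needed to cross, assuming constant speed and no start-up delay."""
    if walking_speed_ft_per_s <= 0:
        raise ValueError("walking speed must be positive")
    return street_width_ft / walking_speed_ft_per_s


# Hypothetical street width; not taken from any of the studies cited.
STREET_WIDTH_FT = 48.0

# Walking speeds (feet/second) quoted in the text.
SPEEDS_FT_PER_S = {
    "design standard": 4.0,
    "70-year-old men (mean)": 3.4,
    "70-year-old women (mean)": 3.0,
    "recommended for older pedestrians": 2.5,
}

for label, speed in SPEEDS_FT_PER_S.items():
    seconds = crossing_time_seconds(STREET_WIDTH_FT, speed)
    print(f"{label}: {seconds:.1f} s")
```

Under these assumptions, a light timed for 4 feet/second allows 12 seconds for the crossing, whereas a pedestrian walking at 3.0 feet/second would need 16 seconds.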


Other environmental factors Other factors that are relatively hazardous for
pedestrians of all ages are major arterials, two or more traffic lanes, length of
the block (250 feet or less), left turn channelization, no sidewalks or curbs, no
street lighting or regularly spaced street lighting, residential and mixed use,
and “T” intersection type (a street ending with outlets onto perpendicular
street) (70).


PREVENTION Pedestrian injuries, like MVC occupant injuries, are a complex
problem for which multiple prevention approaches are necessary.


Environmental modifications Research is necessary to identify and prioritize
which environmental factors should be targeted for development of interven- 
tion programs for older adults. In the Queens Boulevard pedestrian prevention 
project, Retting et al (58) reported that the occurrence of fatal and near-fatal 
pedestrian injuries, especially among older adults, decreased by 43% and 
86%, respectively, two years after implementation of interventions. These 
included stop light changes to increase pedestrian crossing time, modified 
roadway markings, pedestrian signals on median islands, tighter speed limit 
enforcement, and safety education presentations at senior centers. No adjust- 
ment was made for pedestrian volume in the study area. Citywide occurrence 
of fatal pedestrian injuries decreased by 4% over the same time period. 
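
One rough way to read these figures together is to compare the decline in the study area with the citywide decline. The sketch below is an illustration only, not an analysis reported by Retting et al. It assumes that the citywide trend is a fair stand-in for what would have happened in the study area without the interventions, an assumption that the lack of adjustment for pedestrian volume leaves uncertain. The function name is hypothetical.

```python
def relative_change_vs_comparison(study_decrease: float, comparison_decrease: float) -> float:
    """Ratio of (after/before) in the study area to (after/before) in the comparison area.

    Both arguments are proportional decreases, e.g. 0.43 for a 43% decrease.
    """
    if not 0 <= study_decrease <= 1:
        raise ValueError("study_decrease must be between 0 and 1")
    if not 0 <= comparison_decrease < 1:
        raise ValueError("comparison_decrease must be at least 0 and below 1")
    return (1 - study_decrease) / (1 - comparison_decrease)


# Fatal pedestrian injuries: 43% decrease in the study area, 4% citywide.
ratio = relative_change_vs_comparison(0.43, 0.04)
print(f"ratio of changes: {ratio:.2f}")
print(f"decrease relative to citywide trend: {1 - ratio:.0%}")
```

On this crude comparison, the decline in fatal injuries in the study area remains about 41% after allowing for the citywide trend.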


Vehicle design The majority of injuries occurring in PMVCs are a result of 
the pedestrian being “run under” and thrown up onto the automobile, rather 
than being “run over” and having contact with the wheels or ground. Changes 
in motor vehicle design have been effective in decreasing the incidence and 
severity of motor vehicle occupant injuries. Research over the last decade 
indicates that similar design changes to the exterior of the vehicle (bumper, 
hood, and windshield) can potentially reduce by one-third the risk of serious 
injury to pedestrians who are struck (1, 38). 


Legislation and enforcement Legislation has been an effective component of 
many injury control programs, especially those related to motor vehicles. The 
extent to which laws and enforcement play significant roles in pedestrian 
safety, however, is a question that remains to be answered. Enforcement may 
be a key component. Lack of understanding of pedestrian laws and their 
enforcement may be a major reason for noncompliance by drivers in many 
areas of the country. Of 41 states surveyed, with the notable exception of 
California, enforcement of pedestrian safety laws receives little emphasis and 
is regarded as politically difficult (19). 

In many states where laws are enforced, the emphasis traditionally has been 
on the pedestrian, rather than the driver. In Pennsylvania, changes in the code 
in 1977 resulted in virtually no change in driver behavior with regard to 
pedestrians, because the law went virtually unenforced (19). In Seattle, 
Washington, increased enforcement in the six months beginning September 
1990 has resulted in 3625 tickets, at a $47 fine each, written against motorists
who ignore persons in crosswalks, plus 1990 tickets, at a $19 fine each, for
jaywalkers. The effect of these measures is not yet known. 


Pedestrian skill training program Several promising reports on programs 
designed to improve children’s pedestrian skills have appeared in the litera- 
ture (47, 60). The literature is also replete with examples of programs that 
have been ineffective. No similar programs for older adults have been
undertaken. Such programs might be needed to help older adults change their
style of crossing as they age. Again, research is necessary to help illuminate 
the focus of pedestrian skill training programs for older adults. 


Suicide 


INCIDENCE When we hear of suicide, we often think of the adolescent 
male. However, the highest rates of suicide are in men aged 65 and over; 
those aged 85 or older have rates (67/100,000) two- to threefold greater than 
those of adolescents and young adults (Figure 2) (45). Rates in older adult 
men are four- to ninefold higher than in women, with highest rates among 
whites, lowest rates among blacks. The rates among Asians and Hispanics are 
intermediate between the two. 

The number of nonfatal suicide attempts in older adults is not well known. 
There appear to be fewer attempts per completed suicide in the
elderly than in persons younger than 40 years of age: 4 to 1 compared with 20
to 1 (4).

In a population-based survey in Washington state, completion rates for 
suicide were age-dependent, ranging from 3.3% for teenage girls to 79% for 
older men aged 65-74 (67). 


RISK FACTORS 


Marital status Durkheim first emphasized the association between
widowhood and suicide 100 years ago; he described widowhood as “domestic
anomie.” Married persons throughout life have the lowest rate of suicide (65). 
Widowed and divorced older adult men and women have rates of suicide that
are two- to threefold greater than their married counterparts, which indicates
the need for special intervention programs in these groups.

Figure 2 Rates of suicide in the US, 1987.


Physical illness Chronic disabling illness may radically alter a person’s 
sense of self-worth, as well as result in a loss of independence and increased 
isolation (23). Some studies suggest that older adult suicide victims have 
more physical disorders of metabolic, respiratory, and cardiovascular origin 
than older adults who do not attempt suicide (62). 


Depression and mental illness Depression is the most common mental
health symptom in the over 65 age group and is considered a strong risk factor
for suicide (23).


Availability of means The availability of lethal means may convert an 
attempt into a successful suicide. Older adult men most commonly choose 
guns as their means, which may explain the high rate of completed suicides in 
this group. In a study in Arizona, more than one fourth of older adult suicide 
victims had obtained the firearm in the month before the suicide (43). 


PREVENTION Prevention of suicide among the high risk older population 
has been sorely neglected. Formal suicide prevention and psychological
services are under-utilized by older adults, who generally represent only
1-2% of the caseloads of suicide prevention centers (40). Prevention of the 
problem must be multifaceted. Some potential strategies include the follow- 
ing: 


Treatment of prior attempters The single most important risk factor for 
suicide at any age is a history of a prior attempt. Treatment, ranging from 
outpatient therapy to inpatient hospitalization, is the primary intervention for 
those who attempt suicide (47). 


Restrict availability of lethal means This is a complex area for which 
definitive answers are not available. Elimination of carbon monoxide from 
gas for cooking and heating in Britain resulted in a reduction in the overall 
rates of suicide (32), whereas the same changes in Australia (5) and Bern, 
Switzerland (68), resulted in an offsetting increase in suicides by other means. 
Comparison of Seattle to Vancouver, BC, revealed equivalent rates of suicide
among the elderly, despite more restrictive gun control laws in Vancouver
(64).


Education of health professionals Approximately 75% of older adults who 
kill themselves see a physician shortly before committing suicide (54). Many 
older adults do not see the physician with suicidal ideation as their chief 
complaint; instead, they present with somatic or depressive symptoms.
Physicians need to be alert to such risks.


Services to decrease social isolation in older adults More global societal
changes may be necessary to decrease suicide rates in this age group. Allow- 
ing individuals to work as long as they are able and making retirement a 
gradual process that involves counseling may decrease the sense of loss felt by 
many elderly at retirement (47). Programs to increase visitors for older adults, 
similar to those that have been developed for mothers at high risk of child 
abuse, may be necessary in a society in which economics and employment 
preclude older adults living with or even in the same city as their adult 
children. 


Burns 


INCIDENCE Burns are the fourth leading cause of injury death in the US, 
accounting for approximately 5000 deaths each year (2). An additional 
90,000 patients are hospitalized annually for the treatment of burns (2, 10, 
59). The causes of deaths from burns are very different from those that result 
in hospitalizations, thus requiring different preventive strategies (2). In both 
situations, older adults and young children represent populations at increased 
risk. 

Almost 90% of burn and fire deaths occur in residential fires (2). Residen- 
tial fire deaths are highest in the South and lowest in the West (7). In all areas, 
residential fire deaths are much higher for adults over age 64 than they are for 
individuals aged 55-64 (Figure 3). These rate differences are most pronounced
in the South.

In contrast to fatal burn injuries, nonfatal injuries do not occur most
frequently in older adults. The incidence of hospitalization in
Massachusetts for flame/flash burns, scalds, and other burns was lower in those aged 55
and older than in children under the age of 5 and those aged 5-54 (61). The
same age trends were found for all burns that required emergency room care 
(10) and all burns that required medical care (37). 

These differences in relative incidence of fatal and nonfatal burn injuries 
reflect not only the difference in the etiology of these two major categories of 
burns, but also the increased case-fatality rate among older adults. Young 
children and older adults have far higher fatality rates than older children and 
adults for the same degree and extent of burn (9). 

Scald burns, usually caused by a hot food or drink, were the most common 
category of nonfatal burns in older adults (10). A subset of scald burns are 
those caused by hot tap water, accounting for approximately 10% of scald 
burns in this age group. Nonfatal flame burns often involve clothing ignition 
and/or flammable substances. One third of fabric ignitions among older adults 
happen in the kitchen, nearly always associated with cooking. Rossignol et al 
(61) noted that 62% of nonhouse fire burns among older adults involved 
clothing ignition, compared with 30% for all younger persons. 


RISK FACTORS Risk factors for burn injuries in older adults are likely to be
the same as those in other age groups. Poverty is clearly one of the strongest 
risk factors for fatalities from residential fires. Mierley & Baker (42) reported 
a strong correlation between economic conditions and house-fire deaths for 
both blacks and whites. 

Cigarettes are estimated to cause 45% of all fires and 22-56% of residential 
fire deaths (47). Most cigarettes made in this country contain additives that 
cause the cigarette to burn for as long as 28 minutes, even if left unattended. 
Many of the individuals who smoke also consume alcohol, a known risk 
factor for house fires and deaths. 


PREVENTION Prevention strategies for burn injuries have been successful 
and can make a large impact on morbidity and mortality. 

Smoke detectors are perhaps the most inexpensive injury prevention strat- 
egy that one can implement (47). The majority of fire deaths do not involve 
burns, but rather smoke inhalation. Evaluation of smoke detectors reveals that 
they reduce the potential of death in 86% of fires and the risk of severe 
injuries in 88% (14). The effectiveness of smoke detectors is increased if a 
sprinkler system is also used, thereby markedly reducing the spread of a fire. 
Building regulations could be changed to require that all new housing is 
equipped with relatively inexpensive sprinkler systems. 

The technology exists to manufacture a self-extinguishing, fire-safe ciga- 
rette. The benefits of producing fire-safe cigarettes would far outweigh the 
costs to the tobacco industry. If only fire-safe cigarettes were smoked in this
country, nearly 2000 deaths and more than 6000 burn injuries would be 
prevented annually (47, 59). 

One success story in injury control has been that of efforts to reduce tap 
water temperatures. Lowering the water-heater setting to 125° F can prevent 
most tap water-related scald burns (2). The appliance manufacturers have 
tentatively agreed to voluntarily preset all new heaters at the lower tem- 
perature. Many utilities offer free home services to adjust water heater 
settings. 


Assault 


SCOPE OF PROBLEM In 1987, 1330 persons aged 65 and older died from 
assault (45). However, these figures far underestimate the true nature of the 
problem. Just as the 1960s brought to light the problem of child abuse and 
neglect, the 1980s taught us that domestic violence against older adults was a 
common and tragic problem. 

There are no national studies on the extent of this problem. Data are limited 
and come from small prevalence surveys. In the Major Trauma Outcome 
Study of patients admitted to 111 US and Canadian hospitals for the care of 
trauma, stab and gunshot wounds accounted for 8.1% of admissions in those 
aged 65 and older (9). These injuries had very high case-fatality rates: 17.3% 
and 52.1%, respectively, of those assaulted died from their injuries. Men and 
women were equally represented. 

In a community-based survey of older adults in Boston, 2% reported they 
had been physically abused, the most common form of maltreatment suffered 
in this age group (56). Only a small fraction of this abuse appears to come to 
public attention. In Massachusetts, only one of 14 cases was reported (56). 
Thus, official estimates of older adult abuse and assault far underestimate the 
extent of the problem. 

The abuser is a relative in 86% of cases and lives with the older adult in 
75% of cases (12). Approximately 50% of older adult abusers are children or 
grandchildren of the victim, and about 40% are spouses. 


RISK FACTORS Few analytical studies of assaults to older adults have been 
conducted; knowledge of risk factors comes nearly entirely from case series. 
In the Boston survey, men were three times more likely to be physically 
abused than women, and those in poor health were fourfold more likely to be 
abused than those in excellent health (56). Many other studies have found 
women to be at greater risk than men. However, women appear to be more
seriously abused than older adult men, and thus are more likely to come to 
official attention. Other studies have also reported an association of older 
adult abuse with physical and mental impairment (17).

Dependency on others appears to place the older adult at risk of abuse (12).
Although dependency may not be the sole explanation, it may act as a trigger 
by creating stress on a caretaker who is poorly equipped to adapt and cope 
with this burden. 

It is unclear whether socioeconomic status is a risk factor, as older adult 
abuse appears to exist at all levels of household income (12). 

In some studies (31), advanced age appears to be a risk factor for abuse. In 
other studies (56), the risk of abuse was similar in those over age 75 compared 
with the 65-75 age group. 

Alcohol appears to be a risk factor for all domestic violence, including that 
involving older adults (31). Alcohol abuse may affect the victim, the per- 
petrator, or both. 


PREVENTION Prevention of older adult abuse and assault has only recently 
been attempted. In 1985, states spent an average of $22 per child for preven- 
tion and care of child abuse. However, only $2.90 is spent per elderly person 
for the prevention of abuse in this age group (6). No interventions have yet 
been evaluated; thus, none should be implemented without a research com- 
ponent (47). 


Risk assessment Because health providers are often the only professionals 
who interact with abuse victims, risk assessment tools for use by these groups 
may be useful in early detection and treatment. This can take the form of an 
informal assessment with knowledge of the common presentations of older 
adult abuse (53) or the use of more formal standardized screening instruments 
(47). Key elements are the history, physical findings, observation of the 
patient and/or individual accompanying the patient, and whether the severity or
nature of the injury matches the explanation of events. None of these methods,
however, has been tested in a prospective fashion with attempts to identify 
false-positives and false-negatives. 
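
To make concrete what such a prospective evaluation would measure, the sketch below computes sensitivity, specificity, and positive predictive value from a two-by-two table. The counts are entirely hypothetical and do not come from any study cited here; the function name is likewise an assumption made for illustration.

```python
def screening_accuracy(true_pos: int, false_pos: int, false_neg: int, true_neg: int) -> dict:
    """Basic accuracy measures for a screening instrument from 2x2 counts."""
    if true_pos + false_neg == 0 or true_neg + false_pos == 0 or true_pos + false_pos == 0:
        raise ValueError("each denominator must be nonzero")
    return {
        # Proportion of abused patients whom the instrument flags.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Proportion of non-abused patients whom the instrument clears.
        "specificity": true_neg / (true_neg + false_pos),
        # Proportion of flagged patients who are actually abused.
        "positive_predictive_value": true_pos / (true_pos + false_pos),
    }


# Hypothetical counts for 1000 screened patients (illustration only).
result = screening_accuracy(true_pos=18, false_pos=49, false_neg=2, true_neg=931)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, the instrument has high sensitivity and specificity, yet most flagged patients are false-positives because the condition is uncommon in the screened group. This is one reason prospective testing matters before wide use.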


Community education campaigns Several communities have launched pub- 
lic education campaigns designed to increase community awareness of the 
problem of older adult abuse; however, none have been evaluated to date. 
Given the past performance of many community education campaigns, none 
should be widely implemented until proved to be effective (47). 


Community projects Many communities have developed programs to de- 
crease the isolation of older adults and help prevent abuse. These include 
programs in churches, businesses, and public and private agencies. Adult day 
care and respite care programs are examples of the concrete services provided 
by these community-based projects. Outcomes from these programs should be 
monitored for their effects on reducing the rates of older adult assaults in the
community. 


SUMMARY 


Nonfall injuries are an important cause of morbidity and mortality in older 
adults. In addition to the loss of life and human suffering, the economic costs 
and the changes in lifestyle are important aspects of the consequences of 
trauma. Rates of injury as a result of MVCs (occupant and pedestrian), 
suicide, and residential fire are higher in the younger and older segments of 
the population, as indicated by J-shaped or U-shaped curves. Domestic 
violence against older adults is a recognized, but not well investigated, 
problem. Although risk factors have been identified for some of the
cause-specific injuries, the continuation of epidemiologic research is important to
elucidate risk factors, especially those for which interventions can be
developed. The development, implementation, and evaluation of the
intervention programs are necessary for a multifaceted approach to injury control in
older adults.